WHAT IS CLAIMED IS:

1. An improved method for the production of 1,3-propanediol from a microorganism 
comprising the steps of: 

a) obtaining a recombinant microorganism capable of producing 1,3-
propanediol, said microorganism comprising at least one nucleic acid encoding a
dehydratase activity and a nucleic acid encoding protein X; and

b) culturing the recombinant microorganism in the presence of at least one
carbon source capable of being converted to 1,3 propanediol in said transformed
microorganism and under conditions suitable for the production of 1,3 propanediol
wherein the carbon source is selected from the group consisting of monosaccharides,
oligosaccharides, polysaccharides, and a one carbon substrate.

2. The method of Claim 1 wherein said recombinant microorganism comprises at least one
nucleic acid encoding a protein selected from the group consisting of protein 1, protein 2 and
protein 3.

3. The method of Claim 1 further comprising the step of recovering the 1,3 propanediol.

4. The method of Claim 1 wherein the nucleic acid encoding protein X is isolated from a 
glycerol dehydratase gene cluster. 

5. The method of Claim 1 wherein the nucleic acid encoding protein X is isolated from a 
diol dehydratase gene cluster. 

6. The method of Claim 4 wherein the glycerol dehydratase gene cluster is from an
organism selected from the genera consisting of Klebsiella and Citrobacter.

7. The method of Claim 5 wherein the diol dehydratase gene cluster is from an organism
selected from the genera consisting of Klebsiella, Clostridium and Salmonella. 

8. The method of Claim 1 wherein the nucleic acid encoding a dehydratase activity is
heterologous to the organism. 

9. The method of Claim 1 wherein the nucleic acid encoding a dehydratase activity is 
homologous to the organism. 
10. The method of Claim 1 wherein the recombinant microorganism is selected from the
group of genera consisting of Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter,
Lactobacillus, Aspergillus, Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia,
Kluyveromyces, Candida, Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter,
Escherichia, Salmonella, Bacillus, Streptomyces and Pseudomonas.

11. The method of Claim 10 wherein the microorganism is selected from the group
consisting of E. coli and Klebsiella spp.

12. The method of Claim 1 wherein the nucleic acid encoding protein X is stably maintained 
in the host genome. 

13. The method of Claim 2 wherein at least one nucleic acid encoding a protein selected
from protein 1, protein 2 and protein 3 is stably maintained in the host genome.

14. The method of Claim 1 wherein the carbon source is glucose.

15. The method of Claim 1 wherein the nucleic acid encoding protein X has the sequence as
shown in SEQ ID NO: 59.

16. The method of Claim 2 wherein protein 1 has the sequence as shown in SEQ ID NO: 60
or SEQ ID NO: 61.

17. The method of Claim 2 wherein protein 2 has the sequence as shown in SEQ ID NO: 62 
or SEQ ID NO: 63.

18. The method of Claim 2 wherein protein 3 has the sequence as shown in SEQ ID NO:64 
or SEQ ID NO: 65. 

19. A recombinant microorganism capable of producing 1,3-propanediol from a carbon
source said recombinant microorganism comprising a) at least one nucleic acid encoding a
dehydratase activity; b) at least one nucleic acid encoding a glycerol-3-phosphatase; and c) at
least one nucleic acid encoding protein X.



20. The recombinant microorganism of Claim 19 further comprising d) at least one nucleic 
acid encoding a protein selected from the group consisting of protein 1 , protein 2 and protein 3. 
21. The recombinant microorganism of Claim 19 selected from the group consisting of
Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus,
Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida,
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella,
Bacillus, Streptomyces and Pseudomonas.

22. The recombinant microorganism of Claim 19 wherein the nucleic acid encoding protein
X is isolated from a glycerol dehydratase gene cluster.

23. The recombinant microorganism of Claim 19 wherein the nucleic acid encoding protein
X is isolated from a diol dehydratase gene cluster. 

24. The recombinant microorganism of Claim 22 wherein the glycerol dehydratase gene
cluster is from an organism selected from the genera consisting of Klebsiella and Citrobacter.

25. The recombinant microorganism of Claim 23 wherein the diol dehydratase gene cluster
is from an organism selected from the genera consisting of Klebsiella, Clostridium and
Salmonella.

26. The recombinant microorganism of Claim 19 wherein said dehydratase activity is 
heterologous to said microorganism. 

27. The recombinant microorganism of Claim 19 wherein said dehydratase activity is
homologous to said microorganism. 

28. The recombinant microorganism of Claim 19 wherein the nucleic acid encoding protein
X has the sequence as shown in SEQ ID NO: 59. 

29. The recombinant microorganism of Claim 20 wherein protein 1 has the sequence as
shown in SEQ ID NO: 60 or SEQ ID NO: 61. 

30. The recombinant microorganism of Claim 20 wherein protein 2 has the sequence as
shown in SEQ ID NO: 62 or SEQ ID NO: 63.
31. The recombinant of Claim 20 wherein protein 3 has the sequence as shown in SEQ ID NO:
64 or SEQ ID NO: 65.

32. A method for extending the half-life of dehydratase activity in a transformed
microorganism capable of producing 1,3-propanediol and containing at least one nucleic acid
encoding a dehydratase activity, comprising the step of introducing a nucleic acid encoding
protein X into said microorganism and culturing under conditions suitable for production of 1,3-
propanediol.

33. The method of Claim 32 wherein the nucleic acid encoding the dehydratase activity is 
heterologous to said microorganism. 

34. The method of Claim 32 wherein the nucleic acid encoding the dehydratase activity is
homologous to said microorganism.

35. The method of Claim 32 wherein the nucleic acid encoding protein X is isolated from a
glycerol dehydratase gene cluster.

36. The method of Claim 32 wherein the nucleic acid encoding protein X is isolated from a 
diol dehydratase gene cluster. 

37. The method of Claim 35 wherein the glycerol dehydratase gene cluster is from an 
organism selected from the genera consisting of Klebsiella and Citrobacter.

38. The method of Claim 34 wherein the diol dehydratase gene cluster is from an organism
selected from the genera consisting of Klebsiella, Clostridium and Salmonella.

39. The method of Claim 32 wherein the microorganism is selected from the group 
consisting of Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus,
Aspergillus, Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia,
Kluyveromyces, Candida, Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter,
Escherichia, Salmonella, Bacillus, Streptomyces and Pseudomonas.

40. The method of Claim 32 further comprising the step of introducing at least one nucleic 
acid encoding protein 1, protein 2 or protein 3 into said microorganism. 
WHAT IS CLAIMED IS:

1. An improved method for the production of 1,3-propanediol from an organism capable of 
producing 1,3-propanediol, said organism comprising at least one gene encoding a 
dehydratase activity, the method comprising the steps of: 

(a) introducing a gene encoding protein X into the organism to create a transformed 
organism; and 

(b) culturing the transformed organism in the presence of at least one carbon source 
capable of being converted to 1,3 propanediol in said transformed host organism 
and under conditions suitable for the production of 1,3 propanediol wherein the
carbon source is selected from the group consisting of monosaccharides, 
oligosaccharides, polysaccharides, and a one carbon substrate. 

2. The method of Claim 1 further comprising the step of introducing at least one gene encoding 
a protein selected from the group consisting of protein 1, protein 2 and protein 3 into the 
organism. 

3. The method of Claim 1 further comprising the step of recovering the 1,3 propanediol.

4. The method of Claim 1 wherein the gene encoding protein X is isolated from a glycerol 
dehydratase gene cluster. 


5. The method of Claim 1 wherein the gene encoding protein X is isolated from a diol 
dehydratase gene cluster. 



6. The method of Claim 4 wherein the glycerol dehydratase gene cluster is from an organism 
selected from the genera consisting of Klebsiella and Citrobacter.

7. The method of Claim 5 wherein the diol dehydratase gene cluster is from an organism 
selected from the genera consisting of Klebsiella, Clostridium and Salmonella. 

8. The method of Claim 1 wherein the gene encoding a dehydratase activity is heterologous to 
the organism. 



9. The method of Claim 1 wherein the gene encoding a dehydratase activity is homologous to 
the organism. 
10. The method of Claim 1 wherein the organism is selected from the group of genera
consisting of Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, 
Aspergillus, Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, 
Kluyveromyces, Candida, Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, 
Escherichia, Salmonella, Bacillus, Streptomyces and Pseudomonas. 

11. The method of Claim 10 wherein the organism is selected from the group consisting of 
E. coli and Klebsiella spp.

12. The method of Claim 1 wherein the gene encoding protein X is stably maintained in the host 
genome. 

13. The method of Claim 2 wherein at least one gene encoding a protein selected from protein 
1, protein 2 and protein 3 is stably maintained in the host genome. 

14. The method of Claim 1 wherein the carbon source is glucose. 

15. The method of Claim 1 wherein the gene encoding protein X has the sequence as shown in 
SEQ ID NO: 59. 

16. The method of Claim 2 wherein protein 1 has the sequence as shown in SEQ ID NO: 60 or 
SEQ ID NO: 61. 

17. The method of Claim 2 wherein protein 2 has the sequence as shown in SEQ ID NO: 62 or 
SEQ ID NO: 63. 

18. The method of Claim 2 wherein protein 3 has the sequence as shown in SEQ ID NO:64 or 
SEQ ID NO: 65. 

19. A recombinant microorganism capable of producing 1,3-propanediol from a carbon source
said recombinant microorganism comprising a) at least one gene encoding a dehydratase 
activity; b) at least one gene encoding a glycerol-3-phosphatase; and c) at least one gene 
encoding protein X. 

20. The recombinant microorganism of Claim 19 further comprising d) at least one gene 
encoding a protein selected from the group consisting of protein 1, protein 2 and protein 3.
21. The recombinant microorganism of Claim 19 selected from the group consisting of
Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus, 
Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida, 
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella, 
Bacillus, Streptomyces and Pseudomonas. 

22. The recombinant microorganism of Claim 19 wherein the gene encoding protein X is 
isolated from a glycerol dehydratase gene cluster. 

23. The recombinant microorganism of Claim 19 wherein the gene encoding protein X is 
isolated from a diol dehydratase gene cluster. 

24. The recombinant microorganism of Claim 22 wherein the glycerol dehydratase gene cluster 
is from an organism selected from the genera consisting of Klebsiella and Citrobacter.

25. The recombinant microorganism of Claim 23 wherein the diol dehydratase gene cluster is 
from an organism selected from the genera consisting of Klebsiella, Clostridium and 
Salmonella. 

26. The recombinant microorganism of Claim 19 wherein said dehydratase activity is 
heterologous to said microorganism. 

27. The recombinant microorganism of Claim 19 wherein said dehydratase activity is 
homologous to said microorganism. 

28. The recombinant microorganism of Claim 19 wherein the gene encoding protein X has the 
sequence as shown in SEQ ID NO: 59. 

29. The recombinant microorganism of Claim 20 wherein protein 1 has the sequence as shown 
in SEQ ID NO: 60 or SEQ ID NO: 61.

30. The method of Claim 20 wherein protein 2 has the sequence as shown in SEQ ID NO: 62 or 
SEQ ID NO: 63. 
31. The method of Claim 20 wherein protein 3 has the sequence as shown in SEQ ID NO: 64 or
SEQ ID NO: 65. 

32. A method for extending the half-life of dehydratase activity in a microorganism capable of
producing 1,3-propanediol and containing at least one gene encoding a dehydratase activity,
comprising the step of introducing a gene encoding protein X into said microorganism and 
culturing under conditions suitable for production of 1,3-propanediol. 

33. The method of Claim 32 wherein the gene encoding the dehydratase activity is 
heterologous to said microorganism. 

34. The method of Claim 32 wherein the gene encoding the dehydratase activity is homologous 
to said microorganism. 

35. The microorganism of Claim 32 wherein the gene encoding protein X is isolated from a 
glycerol dehydratase gene cluster.

36. The microorganism of Claim 32 wherein the gene encoding protein X is isolated from a diol 
dehydratase gene cluster. 

37. The microorganism of Claim 35 wherein the glycerol dehydratase gene cluster is from an 
organism selected from the genera consisting of Klebsiella and Citrobacter.

38. The microorganism of Claim 34 wherein the diol dehydratase gene cluster is from an 
organism selected from the genera consisting of Klebsiella, Clostridium and Salmonella. 

39. The method of Claim 32 wherein the microorganism is selected from the group consisting 
of Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus, 
Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida, 
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella, 
Bacillus, Streptomyces and Pseudomonas. 

40. The method of Claim 32 further comprising the step of introducing a gene encoding at least 
one of protein 1, protein 2 and protein 3 into said microorganism. 
WHAT IS CLAIMED IS:

1. An improved method for the production of 1,3-propanediol from a microorganism
comprising the steps of: 

a) obtaining a recombinant microorganism capable of producing 1,3- 
propanediol, said microorganism comprising at least one nucleic acid encoding a 
dehydratase activity and a nucleic acid encoding protein X; and 

b) culturing the recombinant microorganism in the presence of at least one 
carbon source capable of being converted to 1,3 propanediol in said transformed 
microorganism and under conditions suitable for the production of 1,3 propanediol 
wherein the carbon source is selected from the group consisting of monosaccharides, 
oligosaccharides, polysaccharides, and a one carbon substrate. 



2. The method of Claim 1 wherein said recombinant microorganism comprises at least one 
nucleic acid encoding a protein selected from the group consisting of protein 1, protein 2 and 
protein 3.

3. The method of Claim 1 further comprising the step of recovering the 1,3 propanediol.

4. The method of Claim 1 wherein the nucleic acid encoding protein X is isolated from a 
glycerol dehydratase gene cluster. 

5. The method of Claim 1 wherein the nucleic acid encoding protein X is isolated from a 
diol dehydratase gene cluster. 

6. The method of Claim 4 wherein the glycerol dehydratase gene cluster is from an 
organism selected from the genera consisting of Klebsiella and Citrobacter.

7. The method of Claim 5 wherein the diol dehydratase gene cluster is from an organism 
selected from the genera consisting of Klebsiella, Clostridium and Salmonella. 

8. The method of Claim 1 wherein the nucleic acid encoding a dehydratase activity is 
heterologous to the organism. 

9. The method of Claim 1 wherein the nucleic acid encoding a dehydratase activity is
homologous to the organism. 
10. The method of Claim 1 wherein the recombinant microorganism is selected from the
group of genera consisting of Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, 
Lactobacillus, Aspergillus, Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, 
Kluyveromyces, Candida, Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, 
Escherichia, Salmonella, Bacillus, Streptomyces and Pseudomonas. 

11. The method of Claim 10 wherein the microorganism is selected from the group
consisting of E. coli and Klebsiella spp.

12. The method of Claim 1 wherein the nucleic acid encoding protein X is stably maintained 
in the host genome. 

13. The method of Claim 2 wherein at least one nucleic acid encoding a protein selected 
from protein 1, protein 2 and protein 3 is stably maintained in the host genome. 

14. The method of Claim 1 wherein the carbon source is glucose. 

15. The method of Claim 1 wherein the nucleic acid encoding protein X has the sequence as 
shown in SEQ ID NO: 59. 

16. The method of Claim 2 wherein protein 1 has the sequence as shown in SEQ ID NO: 60 
or SEQ ID NO: 61. 

17. The method of Claim 2 wherein protein 2 has the sequence as shown in SEQ ID NO: 62
or SEQ ID NO: 63. 

18. The method of Claim 2 wherein protein 3 has the sequence as shown in SEQ ID NO:64 
or SEQ ID NO: 65. 

19. A recombinant microorganism capable of producing 1,3-propanediol from a carbon
source said recombinant microorganism comprising a) at least one nucleic acid encoding a 
dehydratase activity; b) at least one nucleic acid encoding a glycerol-3-phosphatase; and c) at 
least one nucleic acid encoding protein X. 

20. The recombinant microorganism of Claim 19 further comprising d) at least one nucleic
acid encoding a protein selected from the group consisting of protein 1, protein 2 and protein 3. 
21. The recombinant microorganism of Claim 19 selected from the group consisting of
Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus, 
Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida, 
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella, 
Bacillus, Streptomyces and Pseudomonas. 

22. The recombinant microorganism of Claim 19 wherein the nucleic acid encoding protein 
X is isolated from a glycerol dehydratase gene cluster.

23. The recombinant microorganism of Claim 19 wherein the nucleic acid encoding protein 
X is isolated from a diol dehydratase gene cluster. 

24. The recombinant microorganism of Claim 22 wherein the glycerol dehydratase gene 
cluster is from an organism selected from the genera consisting of Klebsiella and Citrobacter.

25. The recombinant microorganism of Claim 23 wherein the diol dehydratase gene cluster 
is from an organism selected from the genera consisting of Klebsiella, Clostridium and 
Salmonella. 

26. The recombinant microorganism of Claim 19 wherein said dehydratase activity is 
heterologous to said microorganism. 

27. The recombinant microorganism of Claim 19 wherein said dehydratase activity is 
homologous to said microorganism. 

28. The recombinant microorganism of Claim 19 wherein the nucleic acid encoding protein 
X has the sequence as shown in SEQ ID NO: 59. 

29. The recombinant microorganism of Claim 20 wherein protein 1 has the sequence as 
shown in SEQ ID NO: 60 or SEQ ID NO: 61.

30. The recombinant microorganism of Claim 20 wherein protein 2 has the sequence as 
shown in SEQ ID NO: 62 or SEQ ID NO: 63. 
31. The recombinant of Claim 20 wherein protein 3 has the sequence as shown in SEQ ID NO:
64 or SEQ ID NO: 65. 

32. A method for extending the half-life of dehydratase activity in a transformed 
microorganism capable of producing 1,3-propanediol and containing at least one nucleic acid 
encoding a dehydratase activity, comprising the step of introducing a nucleic acid encoding 
protein X into said microorganism and culturing under conditions suitable for production of 1,3- 
propanediol. 

33. The method of Claim 32 wherein the nucleic acid encoding the dehydratase activity is 
heterologous to said microorganism. 

34. The method of Claim 32 wherein the nucleic acid encoding the dehydratase activity is 
homologous to said microorganism. 

35. The method of Claim 32 wherein the nucleic acid encoding protein X is isolated from a 
glycerol dehydratase gene cluster. 

36. The method of Claim 32 wherein the nucleic acid encoding protein X is isolated from a 
diol dehydratase gene cluster. 

37. The method of Claim 35 wherein the glycerol dehydratase gene cluster is from an 
organism selected from the genera consisting of Klebsiella and Citrobacter.

38. The method of Claim 34 wherein the diol dehydratase gene cluster is from an organism 
selected from the genera consisting of Klebsiella, Clostridium and Salmonella. 

39. The method of Claim 32 wherein the microorganism is selected from the group 
consisting of Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, 
Aspergillus, Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, 
Kluyveromyces, Candida, Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, 
Escherichia, Salmonella, Bacillus, Streptomyces and Pseudomonas. 

40. The method of Claim 32 further comprising the step of introducing at least one nucleic 
acid encoding protein 1, protein 2 or protein 3 into said microorganism. 
INTERNATIONAL PRELIMINARY EXAMINATION REPORT
under Article 14 are referred to in this report as "originally filed" and are not annexed to the report since they do not contain amendments):



□ the international application as originally filed.

| x| the description,
    pages 1-100, as originally filed.
    pages None, filed with the demand.
    pages NONE, filed with the letter of .
    pages , filed with the letter of .

| x| the claims, Nos. (entries as given: None; None; NONE)
    , as originally filed.
    , as amended under Article 19.
    , filed with the demand.
    , filed with the letter of
    , filed with the letter of

| x| the drawings,
    sheets/fig -27, as originally filed.
    sheets/fig None, filed with the demand.
    sheets/fig NONE, filed with the letter of
    sheets/fig , filed with the letter of .



2. The amendments have resulted in the cancellation of:

| x| the description, pages None

| x| the claims, Nos. None

| x| the drawings, sheets/fig None



3. □ This report has been established as if (some of) the amendments had not been made, since they have been considered
to go beyond the disclosure as filed, as indicated in the Supplemental Box Additional observations below (Rule 70.2(c)).



4. Additional observations, if necessary: 
NONE 
V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
citations and explanations supporting such statement



1. STATEMENT 

Novelty (N)                      Claims 1-40   YES
                                 Claims None   NO

Inventive Step (IS)              Claims 1-40   YES
                                 Claims None   NO

Industrial Applicability (IA)    Claims 1-40   YES
                                 Claims None   NO



2. CITATIONS AND EXPLANATIONS 



Claims 1-40 meet the criteria set out in PCT Article 33(2)-(4), because the prior art does not teach or fairly suggest the
coexpression of the nucleic acid sequence encoding protein X and diol or glycerol dehydratase in a microorganism to
enhance the stability and the activity of the dehydratase in vivo. The transformed microorganism is more efficient in
producing 1,3-propanediol.

NEW CITATIONS 

NONE 
Supplemental Box

(To be used when the space in any of the preceding boxes is not sufficient)



Continuation of: Boxes I - VIII Sheet 10 



CLASSIFICATION: 

The International Patent Classification (IPC) and/or the National classification are as listed below: IPC(6): C12P
7/18; C12N 1/14, 1/20, 9/96 and US CL: 435/158, 188, 252.3, 252.31, 252.33, 252.34, 252.35, 254.11, 254.2, 254.21,
254.22, 254.23, 254.3
PATENT COOPERATION TREATY

From the INTERNATIONAL SEARCHING AUTHORITY

To:



GENENCOR INTERNATIONAL, INC 

Attn. GLAISTER, D.J. 

925 Page Mill Road 

Palo Alto, California 94304-1013 

UNITED STATES OF AMERICA 



PCT

NOTIFICATION OF TRANSMITTAL OF
THE INTERNATIONAL SEARCH REPORT
OR THE DECLARATION
(PCT Rule 44.1)



Applicant's or agent's file reference: GC369-2-PCT

FOR FURTHER ACTION: See paragraphs 1 and 4 below

International application No.: PCT/US 97/20873

International filing date:

Date of mailing (day/month/year): 08/05/1998

Applicant: GENENCOR INTERNATIONAL INC, et al.



Filing of amendments and statement under Article 19

Where? Directly to the International Bureau of WIPO
34, chemin des Colombettes
1211 Geneva 20, Switzerland
Facsimile No.: (41-22) 740.14.35

For more detailed instructions, see the notes on the accompanying sheet.
2. □ The applicant is hereby notified that no International Search Report will be established and that the declaration under

3. □ With regard to the protest against payment of (an) additional fee(s) under Rule 40.2, the applicant is notified that:

□ no decision has been made yet on the protest; the applicant will be notified as soon as a decision is made.

4. Further action(s): The applicant is reminded of the following:
NOTES TO FORM PCT/ISA/220



These Notes are intended to give the basic instructions concerning the filing of amendments under Article 19. The
Notes are based on the requirements of the Patent Cooperation Treaty, the Regulations and the Administrative Instructions
under that Treaty. In case of discrepancy between these Notes and those requirements, the latter are applicable. For more
detailed information, see also the PCT Applicant's Guide, a publication of WIPO.

In these Notes, "Article", "Rule", and "Section" refer to the provisions of the PCT, the PCT Regulations and the PCT
Administrative Instructions respectively.



INSTRUCTIONS CONCERNING AMENDMENTS UNDER ARTICLE 19




The applicant has, after having received the international search report, one opportunity to amend the claims of the
international application. It should however be emphasized that, since all parts of the international application (claims,
description and drawings) may be amended during the international preliminary examination procedure, there is usually
no need to file amendments of the claims under Article 19 except where, e.g., the applicant wants the latter to be published
for the purposes of provisional protection or has another reason for amending the claims before international publication.
Furthermore, it should be emphasized that provisional protection is available in some States only.



What parts of the international application may be amended?

Under Article 19, only the claims may be amended. 

During the international phase, the claims may also be amended (or further amended) under Article 34 before
the International Preliminary Examining Authority. The description and drawings may only be amended under
Article 34 before the International Preliminary Examining Authority.

Upon entry into the national phase, all parts of the international application may be amended under Article 28
or, where applicable, Article 41.

When? Within 2 months from the date of transmittal of the international search report or 16 months from the priority
date, whichever time limit expires later. It should be noted, however, that the amendments will be considered
as having been received on time if they are received by the International Bureau after the expiration of the
applicable time limit but before the completion of the technical preparations for international publication
(Rule 46.1).

Where not to file the amendments?

The amendments may only be filed with the International Bureau and not with the receiving Office or the
International Searching Authority (Rule 46.2).

Where a demand for international preliminary examination has been/is filed, see below.

How? Either by cancelling one or more entire claims, by adding one or more new claims or by amending the text of
one or more of the claims as filed.

A replacement sheet must be submitted for each sheet of the claims which, on account of an amendment or
amendments, differs from the sheet originally filed.

All the claims appearing on a replacement sheet must be numbered in Arabic numerals. Where a claim is
cancelled, no renumbering of the other claims is required. In all cases where claims are renumbered, they must
be renumbered consecutively (Administrative Instructions, Section 205(b)).

The amendments must be made in the language in which the international application is to be published.



What documents must/may accompany the amendments?

Letter (Section 205(b)):

The amendments must be submitted with a letter.

The letter will not be published with the international application and the amended claims. It should not be
confused with the "Statement under Article 19(1)" (see below, under "Statement under Article 19(1)").

The letter must be in English or French, at the choice of the applicant. However, if the language of the
international application is English, the letter must be in English; if the language of the international application
is French, the letter must be in French.
NOTES TO FORM PCT/ISA/220 (continued)



The letter must indicate the differences between the claims as filed and the claims as amended. It must, in
particular, indicate, in connection with each claim appearing in the international application (it being understood
that identical indications concerning several claims may be grouped), whether

(i) the claim is unchanged;

(ii) the claim is cancelled;

(iii) the claim is new;

(iv) the claim replaces one or more claims as filed;

(v) the claim is the result of the division of a claim as filed.



The following examples illustrate the manner in which amendments must be explained in the
accompanying letter:

1. [Where originally there were 46 claims and after amendment of some claims there are 51]:

"Claims 1 to 29, 31, 32, 34, 35, 37 to 46 replaced by amended claims bearing the same numbers;
claims 30, 33 and 36 unchanged; new claims 49 to 51 added."

2. [Where originally there were 15 claims and after amendment of all claims there are 11]:

"Claims 1 to 15 replaced by amended claims 1 to 11."

3. [Where originally there were 14 claims and the amendments consist in cancelling some claims and in adding
new claims]:

"Claims 1 to 6 and 14 unchanged; claims 7 to 13 cancelled; new claims 15, 16 and 17 added." or
"Claims 7 to 13 cancelled; new claims 15, 16 and 17 added; all other claims unchanged."

4. [Where various kinds of amendments are made]:

"Claims 1-10 unchanged; claims 11 to 13, 18 and 19 cancelled; claims 14, 15 and 16 replaced by
amended claim 14; claim 17 subdivided into amended claims 15, 16 and 17; new claims 20 and 21 added."
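
The grouping of identical indications described for the accompanying letter can be sketched in code. The following is a minimal illustration only, not part of the form: the function names `compress` and `summarize`, the status labels, and the output wording are all hypothetical, and the sketch does not model which earlier claims a replaced or divided claim derives from.

```python
def compress(numbers):
    """Turn a list such as [1, 2, 3, 5] into the text '1 to 3, 5'."""
    runs = []
    for n in sorted(numbers):
        if runs and n == runs[-1][1] + 1:
            runs[-1][1] = n          # extend the current consecutive run
        else:
            runs.append([n, n])      # start a new run
    return ", ".join(str(a) if a == b else f"{a} to {b}" for a, b in runs)


def summarize(status_by_claim):
    """Group claims that share an indication, in a fixed (hypothetical) order."""
    order = ["unchanged", "cancelled", "amended", "new"]
    parts = []
    for status in order:
        claims = [c for c, s in status_by_claim.items() if s == status]
        if claims:
            parts.append(f"claims {compress(claims)} {status}")
    return "; ".join(parts)


# Situation of example 3: 14 original claims, 7 to 13 cancelled, 15 to 17 added.
example_3 = {n: "unchanged" for n in list(range(1, 7)) + [14]}
example_3.update({n: "cancelled" for n in range(7, 14)})
example_3.update({n: "new" for n in range(15, 18)})
print(summarize(example_3))
```

The generated sentence is only a draft of the indications; the actual wording of the letter remains the applicant's responsibility.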



"Statement under article 19(1)" (Rule 46.4) 

The amendments may be accompanied by a statement explaining the amendments and indicating any impact 
that such amendments might have on the description and the drawings (which cannot be amended under 
Article 19(1)). 

The statement will be published with the international application and the amended claims.
It must be in the language in which the international application is to be published.

It must be brief, not exceeding 500 words if in English or if translated into English.

It should not be confused with and does not replace the letter indicating the differences between the claims
as filed and as amended. It must be filed on a separate sheet and must be identified as such by a heading,
preferably by using the words "Statement under Article 19(1)."

It may not contain any disparaging comments on the international search report or the relevance of citations
contained in that report. Reference to citations, relevant to a given claim, contained in the international search
report may be made only in connection with an amendment of that claim.



Consequence if a demand for international preliminary examination has already been filed

If, at the time of filing any amendments under Article 19, a demand for international preliminary examination
has already been submitted, the applicant must preferably, at the same time of filing the amendments with the
International Bureau, also file a copy of such amendments with the International Preliminary Examining
Authority (see Rule 62.2(a), first sentence).



Consequence with regard to translation of the international application for entry into the national phase

The applicant's attention is drawn to the fact that, where upon entry into the national phase, a translation of the
claims as amended under Article 19 may have to be furnished to the designated/elected Offices, instead of, or
in addition to, the translation of the claims as filed.

For further details on the requirements of each designated/elected Office, see Volume II of the PCT Applicant's
Guide.
PATENT COOPERATION TREATY

PCT

INTERNATIONAL SEARCH REPORT

(PCT Article 18 and Rules 43 and 44)

Applicant's or agent's file reference: GC369-2-PCT

FOR FURTHER ACTION: see Notification of Transmittal of International Search Report
(Form PCT/ISA/220) as well as, where applicable, item 5 below.

International application No.: PCT/US 97/20873

International filing date (day/month/year): 13/11/1997

(Earliest) Priority Date (day/month/year): 13/11/1996

Applicant: GENENCOR INTERNATIONAL INC. et al.



This International Search Report has been prepared by this International Searching Authority and is transmitted to the applicant
according to Article 18. A copy is being transmitted to the International Bureau.

This International Search Report consists of a total of ___ sheets.

[x] It is also accompanied by a copy of each prior art document cited in this report.

1. Certain claims were found unsearchable (see Box I).

2. Unity of invention is lacking (see Box II).

3. The international application contains disclosure of a nucleotide and/or amino acid sequence listing and the
   international search was carried out on the basis of the sequence listing

   [x] filed with the international application.

   furnished by the applicant separately from the international application,

   [ ] but not accompanied by a statement to the effect that it did not include
       matter going beyond the disclosure in the international application as filed.

   Transcribed by this Authority

4. With regard to the title,
   the text is approved as submitted by the applicant
   the text has been established by this Authority to read as follows:

5. With regard to the abstract,
   the text is approved as submitted by the applicant
   the text has been established, according to Rule 38.2(b), by this Authority as it appears in
   Box III. The applicant may, within one month from the date of mailing of this International
   Search Report, submit comments to this Authority.

6. The figure of the drawings to be published with the abstract is:

   Figure No. ___ as suggested by the applicant.

   [ ] because the applicant failed to suggest a figure.
   [ ] because this figure better characterizes the invention.

   [x] None of the figures.
INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 97/20873

A. CLASSIFICATION OF SUBJECT MATTER

IPC 6 C12N15/53 C12N15/55 C12N15/60 C12P7/18 C12N9/04
C12N9/16 C12N9/88

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)

IPC 6 C12P

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category   Citation of document, with indication, where appropriate,      Relevant to
           of the relevant passages                                       claim No.

P,X,       WO 96 35796 A (DU PONT ; GENENCOR INT (US);                    1-40
L          LAFFEND LISA ANNE (US); NAGARAJAN VASA) 14
           November 1996
           see the whole document
           see abstract
           see examples 2-5
           see examples 22,23
           see page 62, paragraph 2

P,A        WO 96 35795 A (DU PONT ; NAGARAJAN VASANTHA                    1-13,
           (US); NAKAMURA CHARLES EDWIN (US)) 14                          15-40
           November 1996
           see abstract
           see page 9
           see examples 1-3

                                                                          -/--

Further documents are listed in the continuation of box C.

Patent family members are listed in annex.



Special categories of cited documents:

"A" document defining the general state of the art which is not
    considered to be of particular relevance

"E" earlier document but published on or after the international
    filing date

"L" document which may throw doubts on priority claim(s) or
    which is cited to establish the publication date of another
    citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or
    other means

"P" document published prior to the international filing date but
    later than the priority date claimed

"T" later document published after the international filing date
    or priority date and not in conflict with the application but
    cited to understand the principle or theory underlying the
    invention

"X" document of particular relevance; the claimed invention
    cannot be considered novel or cannot be considered to
    involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention
    cannot be considered to involve an inventive step when the
    document is combined with one or more other such documents,
    such combination being obvious to a person skilled
    in the art.

"&" document member of the same patent family



Date of the actual completion of the international search: 21 April 1998

Date of mailing of the international search report: 08/05/1998

Name and mailing address of the ISA:

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer: Lejeune, R
International Application No
PCT/US 97/20873

C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category   Citation of document, with indication, where appropriate,      Relevant to
           of the relevant passages                                       claim No.

A          TOBIMATSU T. ET AL.: "Molecular cloning,                       [illegible]
           sequencing, and expression of the genes
           encoding adenosylcobalamin-dependent diol
           dehydrase of Klebsiella oxytoca."
           JOURNAL OF BIOLOGICAL CHEMISTRY,
           vol. 270, no. 13, 31 March 1995,
           pages 7142-7148, XP002062740
           cited in the application
           [page and column references illegible]
           see abstract

A          TOBIMATSU T. ET AL.: "Cloning, sequencing                      [illegible]
           and high level expression of the genes
           encoding adenosylcobalamin-dependent
           glycerol dehydrase of Klebsiella
           pneumoniae."
           JOURNAL OF BIOLOGICAL CHEMISTRY,
           vol. 271, no. 37, 13 September 1996,
           pages 22352-22357, [XP number illegible]
           see abstract
           [page and column references illegible]

A          TONG I-T. ET AL.: "1,3-Propanediol                             1-4, [illegible],
           production by Escherichia coli expressing                      10,11
           genes from the Klebsiella pneumoniae dha
           regulon."
           APPLIED AND ENVIRONMENTAL MICROBIOLOGY,
           vol. 57, 1991,
           pages 3541-3546, XP002062741
           see abstract

A          SEYFRIED M. ET AL.: "Cloning, sequencing,                      1-4, [illegible]
           and overexpression of the genes encoding
           coenzyme B12-dependent glycerol
           dehydratase of Citrobacter freundii."
           JOURNAL OF BACTERIOLOGY,
           vol. 178, no. 19, October 1996,
           pages 5793-5796, XP002062742
           cited in the application
           see page 5794; figure 1
           see page 5793, paragraph 5
           see abstract
International Application No
PCT/US 97/20873

Patent document            Publication    Patent family        Publication
cited in search report     date           member(s)            date

WO 9635796 A               14-11-96       US 5686276 A         11-11-97
                                          AU 5678996 A         29-11-96
                                          EP 0826057 A         04-03-98

WO 9635795 A               14-11-96       US 5633362 A         27-05-97
                                          AU 5722996 A         29-11-96
                                          EP 0827543 A         11-03-98
VERSION*

WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 6:

C12N 15/53, 15/55, 15/60, C12P 7/18,
C12N 9/04, 9/16, 9/88

A3

(11) International Publication Number: WO 98/21341

(43) International Publication Date: 22 May 1998 (22.05.98)

(21) International Application Number: PCT/US97/20873

(22) International Filing Date: 13 November 1997 (13.11.97)

(30) Priority Data:

60/030,601    13 November 1996 (13.11.96)    US

(63) Related by Continuation (CON) or Continuation-in-Part
(CIP) to Earlier Application

US 60/030,601 (CIP)

Filed on Not furnished



(71) Applicant (for all designated States except US): GENENCOR 

INTERNATIONAL, INC. [US/US]; 4 Cambridge Place, 
1870 South Winton Road, Rochester, NY 14618 (US). 

(72) Inventors; and 

(75) Inventors/Applicants (for US only): DUNN-COLEMAN, 
Nigel, S. [GB/US]; 142 Johnson Avenue, Los Gatos, CA 
95032 (US). DIAZ-TORRES, Maria [ES/US]; 58 North 
El Camino Real, San Mateo, CA 94401 (US). CHASE, 
Matthew, W. [US/US]; 2211-27 Hastings Drive, Belmont, 
CA 94002 (US). TRIMBUR, Donald [US/US]; 349 Orchard 
Avenue, Redwood City, CA 94601 (US). 



(74) Agent: GLAISTER, Debra, J.; Genencor International, Inc., 
Page Mill Road, Palo Alto, CA 94304-1013 (US). 



(81) Designated States: AL, AM, AT, AU, AZ, BB, BG, BR, BY, 
CA, CH, CN, CZ, DE, DK, EE, ES, FI, GB, GE, HU, 
IS, JP, KE, KG, KP, KR, KZ, LK, LR, LS, LT, LU, LV,
MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU,
SD, SE, SG, SI, SK, TJ, TM, TR, TT, UA, UG, US, UZ, 
VN, ARIPO patent (GH, KE, LS, MW, SD, SZ, UG, ZW), 
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European patent (AT, BE, CH, DE, DK, ES, FI, FR, GB, 
GR, IE, IT, LU, MC, NL, PT, SE), OAPI patent (BF, BJ, 
CF, CG, CI, CM, GA, GN, ML, MR, NE, SN, TD, TG). 



Published 

With international search report. 
With amended claims. 

(88) Date of publication of the international search report:

25 June 1998 (25.06.98)

Date of publication of the amended claims:

30 July 1998 (30.07.98)



(54) Title: METHOD FOR THE RECOMBINANT PRODUCTION OF 1,3-PROPANEDIOL 
(57) Abstract 

The present invention provides an improved method for the production of 1,3-propanediol from a variety of carbon sources in an
organism comprising DNA encoding protein X of a dehydratase or protein X in combination with at least one of protein 1, protein 2 and
protein 3. The protein X may be isolated from a diol dehydratase or a glycerol dehydratase. The present invention also provides host cells
comprising protein X that are capable of increased production of 1,3-propanediol.

*(Referred to in PCT Gazette No. 38/1998, Section II)
METHOD FOR THE RECOMBINANT PRODUCTION OF 1,3-PROPANEDIOL

Related Applications

The present application is a continuation-in-part application of United States Provisional
Application 60/030,601 filed November 13, 1996, hereby incorporated herein in its entirety.

Field of Invention

The present invention relates to the field of molecular biology and specifically to improved
methods for the production of 1,3-propanediol in host cells. In particular, the present invention
describes components of gene clusters associated with 1,3-propanediol production in host cells,
including protein X, and protein 1, protein 2 and protein 3. More specifically the present invention
describes the expression of cloned genes encoding protein X, protein 1, protein 2 and protein 3,
either separately or together, for the enhanced production of 1,3-propanediol in host cells.
Background 

1,3-Propanediol is a monomer having potential utility in the production of polyester fibers 
and the manufacture of polyurethanes and cyclic compounds. 

A variety of chemical routes to 1,3-propanediol are known. For example, 1,3-propanediol may
be obtained from ethylene oxide over a catalyst in the presence of phosphine, water, carbon
monoxide, hydrogen and an acid; by the catalytic solution phase hydration of acrolein followed by
reduction; or from hydrocarbons such as glycerol, reacted in the presence of carbon monoxide
and hydrogen over catalysts having atoms from group VIII of the periodic table. Although it is
possible to generate 1,3-propanediol by these methods, they are expensive and generate waste
streams containing environmental pollutants.

It has been known for over a century that 1,3-propanediol can be produced from the
fermentation of glycerol. Bacterial strains able to produce 1,3-propanediol have been found, for
example, in the groups Citrobacter, Clostridium, Enterobacter, Ilyobacter, Klebsiella,
Lactobacillus, and Pelobacter. In each case studied, glycerol is converted to 1,3-propanediol in a
two step, enzyme catalyzed reaction sequence. In the first step, a dehydratase catalyzes the
conversion of glycerol to 3-hydroxypropionaldehyde (3-HP) and water (Equation 1). In the second
step, 3-HP is reduced to 1,3-propanediol by a NAD+-linked oxidoreductase (Equation 2).

The 1,3-propanediol is not metabolized further and, as a result, accumulates in high concentration
in the media. The overall reaction consumes a reducing equivalent in the form of a cofactor,
reduced β-nicotinamide adenine dinucleotide (NADH), which is oxidized to nicotinamide adenine
dinucleotide (NAD+).



Glycerol → 3-HP + H2O                               (Equation 1)
3-HP + NADH + H+ → 1,3-Propanediol + NAD+           (Equation 2)
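
As a cross-check on Equations 1 and 2, the atom balance of each step and of their sum can be verified with a short script. This is only an illustrative sketch: the molecular formulas are standard chemistry rather than taken from the text, the function names are hypothetical, and the cofactor couple NADH + H+ / NAD+ is represented, as a bookkeeping assumption, by the two hydrogen atoms it delivers.

```python
import re
from collections import Counter

# Standard molecular formulas (assumed here, not stated in the text).
FORMULAS = {
    "glycerol": "C3H8O3",
    "3-HP": "C3H6O2",             # 3-hydroxypropionaldehyde
    "water": "H2O",
    "1,3-propanediol": "C3H8O2",
    "2[H]": "H2",                 # bookkeeping stand-in for NADH + H+ -> NAD+
}


def atoms(formula):
    """Count atoms in a simple formula such as 'C3H8O3'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_total(species):
    """Sum the atom counts over all species on one side of an equation."""
    total = Counter()
    for name in species:
        total += atoms(FORMULAS[name])
    return total


def is_balanced(reactants, products):
    return side_total(reactants) == side_total(products)


equation_1 = (["glycerol"], ["3-HP", "water"])
equation_2 = (["3-HP", "2[H]"], ["1,3-propanediol"])
net = (["glycerol", "2[H]"], ["1,3-propanediol", "water"])

for label, (lhs, rhs) in [("Equation 1", equation_1),
                          ("Equation 2", equation_2),
                          ("Net reaction", net)]:
    print(label, "balanced:", is_balanced(lhs, rhs))
```

Under these assumptions the two steps sum to glycerol plus one reducing equivalent giving 1,3-propanediol plus water, consistent with the statement that the overall reaction consumes NADH.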
The production of 1,3-propanediol from glycerol is generally performed under anaerobic
conditions using glycerol as the sole carbon source and in the absence of other exogenous
reducing equivalent acceptors. Under these conditions, in, for example, strains of Citrobacter,
Clostridium, and Klebsiella, a parallel pathway for glycerol operates which first involves oxidation
of glycerol to dihydroxyacetone (DHA) by a NAD+- (or NADP+-) linked glycerol dehydrogenase
(Equation 3). The DHA, following phosphorylation to dihydroxyacetone phosphate (DHAP) by a
DHA kinase (Equation 4), becomes available for biosynthesis and for supporting ATP generation
via, for example, glycolysis.

In contrast to the 1,3-propanediol pathway, this pathway may provide carbon and energy to the
cell and produces rather than consumes NADH.

In Klebsiella pneumoniae and Citrobacter freundii, the genes encoding the functionally
linked activities of glycerol dehydratase (dhaB), 1,3-propanediol oxidoreductase (dhaT), glycerol
dehydrogenase (dhaD), and dihydroxyacetone kinase (dhaK) are encompassed by the dha
regulon. The dha regulons from Citrobacter and Klebsiella have been expressed in Escherichia
coli and have been shown to convert glycerol to 1,3-propanediol. Glycerol dehydratase (E.C.
4.2.1.30) and diol [1,2-propanediol] dehydratase (E.C. 4.2.1.28) are related but distinct enzymes
that are encoded by distinct genes. In Salmonella typhimurium and Klebsiella pneumoniae, diol
dehydratase is associated with the pdu operon, see Bobik et al., 1992, J. Bacteriol. 174:2253-2266
and United States patent 5,633,362. Tobimatsu et al., 1996, J. Biol. Chem. 271:22352-22357
disclose the K. pneumoniae gene encoding glycerol dehydratase protein X identified as ORF 4;
Seyfried et al., 1996, J. Bacteriol. 178:5793-5796 disclose the C. freundii glycerol dehydratase
gene encoding protein X identified as ORF Z. Tobimatsu et al., 1995, J. Biol. Chem. 270:7142-7148
disclose the diol dehydratase subunits α, β and γ and illustrate the presence of orf 4. Luers et al.
(1997, FEMS Microbiology Letters 154:337-345) disclose the amino acid sequence of protein 1,
protein 2 and protein 3 of Clostridium pasteurianum.

Biological processes for the preparation of glycerol are known. The overwhelming
majority of glycerol producers are yeasts, but some bacteria, other fungi and algae are also
known to produce glycerol. Both bacteria and yeasts produce glycerol by converting glucose or
other carbohydrates through the fructose-1,6-bisphosphate pathway in glycolysis or by the
Embden Meyerhof Parnas pathway, whereas certain algae convert dissolved carbon dioxide or
bicarbonate in the chloroplasts into the 3-carbon intermediates of the Calvin cycle. In a series of
steps, the 3-carbon intermediate, phosphoglyceric acid, is converted to glyceraldehyde
3-phosphate which can be readily interconverted to its keto isomer dihydroxyacetone phosphate
and ultimately to glycerol.



Glycerol + NAD+ → DHA + NADH + H+                   (Equation 3)
DHA + ATP → DHAP + ADP                              (Equation 4)
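
The redox coupling between the reductive branch (Equation 2, one NADH consumed per glycerol) and the oxidative branch (Equation 3, one NADH produced per glycerol) can be illustrated with a small calculation. This is a simplified sketch under stated assumptions, not a metabolic model: the function name is hypothetical, biomass formation and other NADH sinks are ignored, and any NADH yield above 1 per oxidized glycerol is an assumption about downstream metabolism that the text does not quantify.

```python
from fractions import Fraction


def max_reductive_fraction(nadh_per_oxidized_glycerol):
    """Largest fraction of glycerol that can be reduced to 1,3-propanediol.

    If a fraction f of the glycerol is reduced (1 NADH consumed each) and
    1 - f is oxidized (n NADH produced each), redox balance requires
    f = n * (1 - f), which gives f = n / (1 + n).
    """
    n = Fraction(nadh_per_oxidized_glycerol)
    if n < 0:
        raise ValueError("NADH yield cannot be negative")
    return n / (1 + n)


# Counting only Equation 3 (n = 1), half of the glycerol can be reduced.
print(max_reductive_fraction(1))
# Hypothetical case: two NADH recovered per oxidized glycerol.
print(max_reductive_fraction(2))
```

The point of the sketch is only that, with glycerol as the sole carbon source, the reductive pathway depends on reducing equivalents generated by the parallel oxidative pathway.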
Specifically, the bacteria Bacillus licheniformis and Lactobacillus lycopersica synthesize
glycerol, and glycerol production is found in the halotolerant algae Dunaliella sp. and Asteromonas
gracilis for protection against high external salt concentrations (Ben-Amotz et al., Experientia 38,
49-52, (1982)). Similarly, various osmotolerant yeasts synthesize glycerol as a protective
measure. Most strains of Saccharomyces produce some glycerol during alcoholic fermentation,
and this can be increased physiologically by the application of osmotic stress (Albertyn et al., Mol.
Cell. Biol. 14, 4135-4144, (1994)). Earlier this century commercial glycerol production was
achieved by the use of Saccharomyces cultures to which "steering reagents" were added such as
sulfites or alkalis. Through the formation of an inactive complex, the steering agents block or
inhibit the conversion of acetaldehyde to ethanol; thus, excess reducing equivalents (NADH) are
available to or "steered" towards DHAP for reduction to produce glycerol. This method is limited
by the partial inhibition of yeast growth that is due to the sulfites. This limitation can be partially
overcome by the use of alkalis which create excess NADH equivalents by a different mechanism.
In this practice, the alkalis initiate a Cannizzaro disproportionation to yield ethanol and acetic acid
from two equivalents of acetaldehyde.

The gene encoding glycerol-3-phosphate dehydrogenase (DAR1, GPD1) has been cloned
and sequenced from S. diastaticus (Wang et al., J. Bact. 176, 7091-7095, (1994)). The DAR1
gene was cloned into a shuttle vector and used to transform E. coli where expression produced
active enzyme. Wang et al. (supra) recognize that DAR1 is regulated by the cellular osmotic
environment but do not suggest how the gene might be used to enhance 1,3-propanediol
production in a recombinant organism.

Other glycerol-3-phosphate dehydrogenase enzymes have been isolated: for example,
sn-glycerol-3-phosphate dehydrogenase has been cloned and sequenced from S. cerevisiae
(Larsson et al., Mol. Microbiol. 10, 1101, (1993)) and Albertyn et al., (Mol. Cell. Biol. 14, 4135,
(1994)) teach the cloning of GPD1 encoding a glycerol-3-phosphate dehydrogenase from
S. cerevisiae. Like Wang et al. (supra), both Albertyn et al. and Larsson et al. recognize the
osmo-sensitivity of the regulation of this gene but do not suggest how the gene might be used in
the production of 1,3-propanediol in a recombinant organism.

As with G3PDH, glycerol-3-phosphatase has been isolated from Saccharomyces
cerevisiae and the protein identified as being encoded by the GPP1 and GPP2 genes (Norbeck et
al., J. Biol. Chem. 271, 13875, (1996)). Like the genes encoding G3PDH, it appears that GPP2 is
osmosensitive.

Although biological methods of both glycerol and 1,3-propanediol production are known, it
has never been demonstrated that the entire process can be accomplished by a single
recombinant organism.

Neither the chemical nor biological methods described above for the production of
1,3-propanediol are well suited for industrial scale production since the chemical processes are
energy intensive and the biological processes require the expensive starting material, glycerol. A
method requiring low energy input and an inexpensive starting material is needed. A more
desirable process would incorporate a microorganism that would have the ability to convert basic
carbon sources such as carbohydrates or sugars to the desired 1,3-propanediol end-product.

Although a single organism conversion of fermentable carbon source other than glycerol
or dihydroxyacetone to 1,3-propanediol would be desirable, it has been documented that there are
significant difficulties to overcome in such an endeavor. For example, Gottschalk et al. (EP 373
230) teach that the growth of most strains useful for the production of 1,3-propanediol, including
Citrobacter freundii, Clostridium autobutylicum, Clostridium butylicum, and Klebsiella pneumoniae,
is disturbed by the presence of a hydrogen donor such as fructose or glucose. Strains of
Lactobacillus brevis and Lactobacillus buchneri, which produce 1,3-propanediol in
co-fermentations of glycerol and fructose or glucose, do not grow when glycerol is provided as the
sole carbon source, and, although it has been shown that resting cells can metabolize glucose or
fructose, they do not produce 1,3-propanediol (Veiga da Cunha et al., J. Bacteriol. 174, 1013
(1992)). Similarly, it has been shown that a strain of Ilyobacter polytropus, which produces
1,3-propanediol when glycerol and acetate are provided, will not produce 1,3-propanediol from
carbon substrates other than glycerol, including fructose and glucose (Steib et al., Arch.
Microbiol. 140, 139 (1984)). Finally, Tong et al. (Appl. Biochem. Biotech. 34, 149 (1992)) have
taught that recombinant Escherichia coli transformed with the dha regulon encoding glycerol
dehydratase does not produce 1,3-propanediol from either glucose or xylose in the absence of
exogenous glycerol.

Attempts to improve the yield of 1,3-propanediol from glycerol have been reported where
co-substrates capable of providing reducing equivalents, typically fermentable sugars, are
included in the process. Improvements in yield have been claimed for resting cells of Citrobacter
freundii and Klebsiella pneumoniae DSM 4270 cofermenting glycerol and glucose (Gottschalk et
al., supra, and Tran-Dinh et al., DE 3734 764), but not for growing cells of Klebsiella pneumoniae
ATCC 25955 cofermenting glycerol and glucose, which produced no 1,3-propanediol (I-T. Tong,
Ph.D. Thesis, University of Wisconsin-Madison (1992)). Increased yields have been reported for
the cofermentation of glycerol and glucose or fructose by a recombinant Escherichia coli;
however, no 1,3-propanediol is produced in the absence of glycerol (Tong et al., supra). In these
systems, single organisms use the carbohydrate as a source of generating NADH while providing
energy and carbon for cell maintenance or growth. These disclosures suggest that sugars do not
enter the carbon stream that produces 1,3-propanediol. In no case is 1,3-propanediol produced in
the absence of an exogenous source of glycerol. Thus the weight of literature clearly suggests
that the production of 1,3-propanediol from a carbohydrate source by a single organism is not
possible.
The weight of literature regarding the role of protein X in 1,3-propanediol production by a
host cell is at best confusing. Prior to the availability of gene information, McGee et al., 1982,
Biochem. Biophys. Res. Comm. 108: 547-551, reported diol dehydratase from K. pneumoniae
ATCC 8724 to be composed of four subunits identified by size (60K, 51K, 29K, and 15K daltons)
and N-terminal amino acid sequence. In direct contrast to McGee, Tobimatsu et al., 1995, supra,
report the cloning, sequencing and expression of diol dehydratase from the same organism and
find no evidence linking the 51K dalton polypeptide to dehydrase. Tobimatsu et al., 1996, supra,
conclude that the protein X polypeptide is not a subunit of glycerol dehydratase, in contrast to
GenBank Accession Number U30903 where protein X is described as a large subunit of glycerol
dehydratase. Seyfried et al., supra, report that a deletion of 192 bp from the 3' end of orfZ
(protein X) had no effect on enzyme activity and conclude that orfZ does not encode a subunit
required for dehydratase activity. Finally, Skraly, F.A. (1997, Thesis entitled "Metabolic
Engineering of an Improved 1,3-Propanediol Fermentation") discloses a loss of glycerol
dehydratase activity in one experiment where recombinant ORF3 (protein X) was disrupted
creating a large fusion protein but not in another experiment where 1,3-propanediol production
from glycerol was diminished compared to a control where ORF3 was intact.

The problem to be solved by the present invention is the biological production of
1,3-propanediol by a single recombinant organism from an inexpensive carbon substrate such as
glucose or other sugars in commercially feasible quantities. The biological production of
1,3-propanediol requires glycerol as a substrate for a two step sequential reaction in which a
dehydratase enzyme (typically a coenzyme B12-dependent dehydratase) converts glycerol to an
intermediate, 3-hydroxypropionaldehyde, which is then reduced to 1,3-propanediol by a NADH-
(or NADPH-) dependent oxidoreductase. The complexity of the cofactor requirements
necessitates the use of a whole cell catalyst for an industrial process which utilizes this reaction
sequence for the production of 1,3-propanediol. Furthermore, in order to make the process
economically viable, a less expensive feedstock than glycerol or dihydroxyacetone is needed and
high production levels are desirable. Glucose and other carbohydrates are suitable substrates,
but, as discussed above, are known to interfere with 1,3-propanediol production. As a result no
single organism has been shown to convert glucose to 1,3-propanediol.

Applicants have solved the stated problem and the present invention provides for
bioconverting a fermentable carbon source directly to 1,3-propanediol using a single recombinant
organism. Glucose is used as a model substrate and the bioconversion is applicable to any
existing microorganism. Microorganisms harboring the genes encoding protein X and protein 1,
protein 2 and protein 3, in addition to other proteins associated with the production of
1,3-propanediol, are able to convert glucose and other sugars through the glycerol degradation
pathway to 1,3-propanediol with good yields and selectivities. Furthermore, the present invention
may be generally applied to include any carbon substrate that is readily converted to 1) glycerol,
2) dihydroxyacetone, 3) C3 compounds at the oxidation state of glycerol (e.g., glycerol
3-phosphate), or 4) C3 compounds at the oxidation state of dihydroxyacetone (e.g.,
dihydroxyacetone phosphate or glyceraldehyde 3-phosphate).
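
The phrase "at the oxidation state of" glycerol or dihydroxyacetone can be made concrete by computing the mean oxidation state of carbon in each C3 compound. The sketch below is illustrative only and rests on stated assumptions: it uses the standard rule for neutral compounds containing only C, H and O (hydrogen counted as +1, oxygen as -2), the molecular formulas are standard chemistry rather than taken from the text, the function name is hypothetical, and the phosphorylated forms are represented by their unphosphorylated parents on the assumption that phosphate ester formation is not a redox change at carbon.

```python
from fractions import Fraction

# (carbon, hydrogen, oxygen) atom counts; standard formulas, assumed here.
COMPOUNDS = {
    "glycerol": (3, 8, 3),
    "3-hydroxypropionaldehyde": (3, 6, 2),
    "1,3-propanediol": (3, 8, 2),
    "dihydroxyacetone": (3, 6, 3),
    "glyceraldehyde": (3, 6, 3),
}


def mean_carbon_oxidation_state(c, h, o):
    """Mean oxidation state of carbon in a neutral C/H/O compound."""
    return Fraction(2 * o - h, c)


for name, (c, h, o) in COMPOUNDS.items():
    print(f"{name}: {mean_carbon_oxidation_state(c, h, o)}")
```

On this measure glycerol and 3-hydroxypropionaldehyde share one oxidation state, dihydroxyacetone and glyceraldehyde share a more oxidized one, and 1,3-propanediol is the most reduced of the group, which matches the dehydration followed by reduction described earlier.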

Summary of the Invention

The present invention relates to improved methods for the production of 1,3-propanediol
from a single microorganism. The present invention is based, in part, upon the unexpected
discovery that the presence of a gene encoding protein X in a microorganism containing at least
one gene encoding a dehydratase activity and capable of producing 1,3-propanediol is associated
with the in vivo reactivation of dehydratase activity and increased production of 1,3-propanediol in
the microorganism. The present invention is also based, in part, upon the unexpected discovery
that the presence of a gene encoding protein X and at least one gene encoding a protein selected
from the group consisting of protein 1, protein 2 and protein 3 in host cells containing at least one
gene encoding a dehydratase activity and capable of producing 1,3-propanediol is associated with
in vivo reactivation of the dehydratase activity and increased yields of 1,3-propanediol in the
microorganism.

Accordingly, the present invention provides an improved method for the production of 1,3- 
propanediol from a microorganism capable of producing 1,3-propanediol, said microorganism 
comprising at least one gene encoding a dehydratase activity, the method comprising the steps of 
introducing a gene encoding protein X into the organism to create a transformed organism; and 
culturing the transformed organism in the presence of at least one carbon source capable of 
being converted to 1,3-propanediol in said transformed host organism and under conditions
suitable for the production of 1,3-propanediol wherein the carbon source is selected from the
group consisting of monosaccharides, oligosaccharides, polysaccharides, and a one carbon 
substrate. 

In a preferred embodiment, the method for improved production of 1,3-propanediol 
further comprises introducing at least one gene encoding a protein selected from the group 
consisting of protein 1, protein 2 and protein 3 into the organism. The microorganism may further
comprise at least one of (a) a gene encoding a glycerol-3-phosphate dehydrogenase activity; (b) a
gene encoding a glycerol-3-phosphatase activity; and (c) a gene encoding 1,3-propanediol
oxidoreductase activity. Gene(s) encoding a dehydratase activity, protein
X, proteins 1, 2 or 3 or other genes necessary for the production of 1,3-propanediol may be stably 
maintained in the host cell genome or may be on replicating plasmids residing in the host 
microorganism. 

The method optionally comprises the step of recovering the 1,3-propanediol. In one
aspect of the present invention, the carbon source is glucose.

The microorganism is selected from the group of genera consisting of Citrobacter,
Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus, Saccharomyces, 
Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida, Hansenula, 
Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella, Bacillus, 
Streptomyces and Pseudomonas. 

In one aspect, protein X is derived from a glycerol dehydratase gene cluster and in another
aspect, protein X is derived from a diol dehydratase gene cluster. The gene encoding the
dehydratase activity may be homologous to the microorganism or heterologous to the
microorganism. In one embodiment, the glycerol dehydratase gene cluster is derived from an
organism selected from the genera consisting of Klebsiella and Citrobacter. In another
embodiment, the diol dehydratase gene cluster is derived from an organism selected from the 
genera consisting of Klebsiella, Clostridium and Salmonella. 

In another aspect, the present invention provides a recombinant microorganism 
comprising at least one gene encoding a dehydratase activity; at least one gene encoding a 
glycerol-3-phosphatase; and at least one gene encoding protein X, wherein said microorganism is 
capable of producing 1,3-propanediol from a carbon source. The carbon source may be selected
from the group consisting of monosaccharides, oligosaccharides, polysaccharides, and a one 
carbon substrate. In a further embodiment, the microorganism further comprises a gene 
encoding a cytosolic glycerol-3-phosphate dehydrogenase. In another embodiment, the 
recombinant microorganism further comprises at least one gene encoding a protein selected from 
the group consisting of protein 1, protein 2 and protein 3. The microorganism is selected from the 
group consisting of Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, 
Aspergillus, Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, 
Kluyveromyces, Candida, Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, 
Escherichia, Salmonella, Bacillus, Streptomyces and Pseudomonas. In one aspect, protein X is 
derived from a glycerol dehydratase gene cluster. In another aspect, protein X is derived from a 
diol dehydratase gene cluster. In one aspect, the dehydratase activity is heterologous to said 
microorganism and in another aspect, the dehydratase activity is homologous to said 
microorganism. 

The present invention also provides a method for the in vivo reactivation of a dehydratase 
activity in a microorganism capable of producing 1,3-propanediol and containing at least one 
gene encoding a dehydratase activity, comprising the step of introducing a gene encoding protein 
X into said microorganism. The microorganism is selected from the group consisting of 
Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus, 
Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida,
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella, Bacillus,
Streptomyces and Pseudomonas. In one aspect, the gene encoding the dehydratase activity is
heterologous to said microorganism and in another aspect, the gene encoding the dehydratase
activity is homologous to said microorganism. In one embodiment, the gene encoding protein X is
derived from a glycerol dehydratase gene cluster and in another embodiment, the gene encoding
protein X is derived from a diol dehydratase gene cluster.

[...] host cell [...] 2 and 3 and protein X is able to produce significantly more
1,3-propanediol [...]

[...] increased production of 1,3-propanediol in the host cell over 1,3-propanediol production in
the host cell lacking nucleic acid encoding protein X along with nucleic acid encoding at least one
of protein 1, protein 2 and protein 3.

[...] the method of production of the present invention as shown [...] is the reactivation of the
dehydratase activity in a microorganism that is associated with the presence of nucleic acid
encoding protein X in the microorganism.

Figure 1 [...] on plasmid pHK28-26 (SEQ ID NO:19). In this figure, orfY encodes protein 1, orfX
encodes protein 2 and orfW encodes protein 3. DhaB-X refers to protein X.

Figures 2A-2G illustrate the nucleotide and amino acid sequence of Klebsiella
pneumoniae glycerol dehydratase protein X (dhab4) (SEQ ID NO:59).

Figure 3 illustrates the amino acid alignment of Klebsiella pneumoniae protein 1 (SEQ ID
NO: 61) and Citrobacter protein 1 (SEQ ID NO: 60) (designated in Figure 3 as orfY).

Figure 4 illustrates the amino acid alignment of Klebsiella pneumoniae protein 2 (SEQ ID
NO: 63) and Citrobacter protein 2 (SEQ ID NO: 62) (designated in Figure 4 as orfX).

Figure 5 illustrates the amino acid alignment of Klebsiella pneumoniae protein 3 (SEQ ID
NO: 64) and Citrobacter protein 3 (SEQ ID NO: 65) (designated in Figure 5 as orfW).

Figure 6 illustrates the in situ reactivation comparison of plasmids pHK28-26 (which
contains dhaB subunits 1, 2 and 3 as well as protein X and the open reading frames encoding 
protein 1, protein 2 and protein 3) vs. pDT24 (which contains dhaB subunits 1, 2 and 3 as well as
protein X) in E. coli DH5α cells.

Figure 7 illustrates the in situ reactivation comparison of plasmids pM7 (containing genes
encoding dhaB subunits 1, 2 and 3 and protein X) vs. plasmid pM11 (containing genes encoding
dhaB subunits 1, 2 and 3) in E. coli DH5α cells.

Figures 8A-8E illustrate the nucleic acid (SEQ ID NO: 66) and amino acid (SEQ ID
NO: 67) sequence of K. pneumoniae diol dehydratase gene cluster protein X.

Figure 9 illustrates a standard 10 liter fermentation for 1,3-propanediol production
using E. coli FM5/pDT24 (FM5 described in Amgen patent US 5,494,816, ATCC accession
No. 53911).

Figure 10 illustrates a standard 10 liter fermentation for 1,3-propanediol production
using E. coli DH5α/pHK28-26.

Brief Description of Biological Deposits and Sequence Listing

The transformed E. coli W2042 (comprising the E. coli host W1485 and plasmids pDT20
and pAH42) containing the genes encoding glycerol-3-phosphate dehydrogenase (G3PDH) and
glycerol-3-phosphatase (G3P phosphatase), glycerol dehydratase (dhaB), and 1,3-propanediol
oxidoreductase (dhaT) was deposited on 26 September 1996 with the ATCC under the terms of
the Budapest Treaty on the International Recognition of the Deposit of Micro-organisms for the
Purpose of Patent Procedure and is designated as ATCC 98188.

S. cerevisiae YPH500 harboring plasmids pMCK10, pMCK17, pMCK30 and pMCK35
containing genes encoding glycerol-3-phosphate dehydrogenase (G3PDH) and glycerol-3- 
phosphatase (G3P phosphatase), glycerol dehydratase (dhaB), and 1,3-propanediol 
oxidoreductase (dhaT) was deposited on 26 September 1996 with the ATCC under the terms of 
the Budapest Treaty on the International Recognition of the Deposit of Micro-organisms for the 
Purpose of Patent Procedure and is designated as ATCC 74392. 

E. coli DH5α containing pKP1, which has about 35 kb of a Klebsiella genome containing
the glycerol dehydratase, protein X and proteins 1, 2 and 3, was deposited on 18 April 1995 with
the ATCC under the terms of the Budapest Treaty and was designated ATCC 69789. E. coli DH5α
containing pKP4, which contains a portion of the Klebsiella genome encoding the diol dehydratase
enzyme, including protein X, was deposited on 18 April 1995 with the ATCC under the terms of the
Budapest Treaty and was designated ATCC 69790.

"ATCC" refers to the American Type Culture Collection international depository located at 
12301 Parklawn Drive, Rockville, MD 20852 U.S.A. The designations refer to the accession 
number of the deposited material.

Detailed Description of the Invention

The present invention relates to the production of 1,3-propanediol in a single
microorganism and provides improved methods for production of 1,3-propanediol from a
fermentable carbon source in a single recombinant organism. The method incorporates a
microorganism capable of producing 1,3-propanediol comprising either homologous or
heterologous genes encoding dehydratase (dhaB), at least one gene encoding protein X and
optionally at least one of the genes encoding a protein selected from the group consisting of
protein 1, protein 2 and protein 3. Optionally, the microorganism contains at least one gene
encoding glycerol-3-phosphate dehydrogenase, glycerol-3-phosphatase and 1,3-propanediol
oxidoreductase (dhaT). The recombinant microorganism is contacted with a carbon substrate and
1,3-propanediol is isolated from the growth media.

The present method provides a rapid, inexpensive and environmentally responsible 
source of 1,3-propanediol monomer useful in the production of polyesters and other polymers.

The following definitions are to be used to interpret the claims and specification. 

The term "dehydratase gene cluster" or "gene cluster" refers to the set of genes which 
are associated with 1,3-propanediol production in a host cell and is intended to encompass
glycerol dehydratase gene clusters as well as diol dehydratase gene clusters. The dha regulon
refers to a glycerol dehydratase gene cluster, as illustrated in Figure 1 which includes regulatory 
regions. 

The term "regenerating the dehydratase activity" or "reactivating the dehydratase activity" 
refers to the phenomenon of converting a dehydratase not capable of catalysis of a substrate to 
one capable of catalysis of a substrate or to the phenomenon of inhibiting the inactivation of a 
dehydratase or the phenomenon of extending the useful half-life of the dehydratase enzyme in
vivo. 

The terms "glycerol dehydratase" or "dehydratase enzyme" or "dehydratase activity" refer 
to the polypeptide(s) responsible for an enzyme activity that is capable of isomerizing or 
converting a glycerol molecule to the product 3-hydroxypropionaldehyde. For the purposes of the 
present invention the dehydratase enzymes include a glycerol dehydratase (GenBank U09771, 
U30903) and a diol dehydratase (GenBank D45071) having preferred substrates of glycerol and 
1,2-propanediol, respectively. Glycerol dehydratase of K. pneumoniae ATCC 25955 is encoded 
by the genes dhaB1, dhaB2, and dhaB3 identified as SEQ ID NOS:1, 2 and 3, respectively. The
dhaB1, dhaB2, and dhaB3 genes code for the α, β, and γ subunits of the glycerol dehydratase
enzyme, respectively. 

The phrase "protein X of a dehydratase gene cluster" or "dhaB protein X" or "protein X" 
refers to a protein that is comparable to protein X of the Klebsiella pneumoniae dehydratase gene 
cluster as shown in Figure 2 or alternatively comparable to protein X of Klebsiella pneumoniae 
diol dehydratase gene cluster as shown in Figure 8. Preferably protein X is capable of increasing
the production of 1,3-propanediol in a host organism over the production of 1,3-propanediol in the
absence of protein X in the host organism. Being comparable means either that DNA encoding the
protein is in the same structural location as DNA encoding Klebsiella protein X with respect
to Klebsiella dhaB1, dhaB2 and dhaB3, i.e., DNA encoding protein X is 3' to nucleic acid encoding
dhaB1-B3, or that protein X has overall amino acid similarity to either Klebsiella diol or glycerol
dehydratase protein X. The present invention encompasses protein X molecules having at least
50%; or at least 65%; or at least 80%; or at least 90% or at least 95% similarity to the protein X of
K. pneumoniae glycerol or diol dehydratase or the C. freundii protein X.
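
The similarity thresholds recited above can be illustrated with a short sketch. It assumes the two sequences have already been aligned and scores simple percent identity over the non-gap columns; the specification does not state which alignment program or similarity measure is intended, and a true "similarity" score would usually also credit conservative substitutions. The function names and the toy sequences are hypothetical and are not protein X sequences.

```python
# Thresholds (in percent) recited in the text.
THRESHOLDS = (50, 65, 80, 90, 95)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of non-gap alignment columns whose residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def thresholds_met(score):
    """List every recited threshold that the score satisfies."""
    return [t for t in THRESHOLDS if score >= t]


# Hypothetical pre-aligned toy sequences (two mismatches in ten columns).
seq_a = "MKRSKRFAVL"
seq_b = "MKRSQRFAIL"
score = percent_identity(seq_a, seq_b)
print(score, thresholds_met(score))  # 80.0 [50, 65, 80]
```

Under these assumptions a candidate scoring 80% would fall within the first three recited ranges but not the 90% or 95% ranges.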

Included within the term "protein X" is protein X, also referred to as ORF Z, from the
Citrobacter dha regulon (Seyfried et al., 1996, J. Bacteriol. 178: 5793-5796). The present invention
also encompasses amino acid variations of protein X from any microorganism as long as the
protein X variant retains its essential functional characteristics of increasing the production of 1,3-
propanediol in a host organism over the production of 1,3-propanediol in the host organism in the
absence of protein X. 

A portion of the Klebsiella genome encoding the glycerol dehydratase enzyme activity as 
well as protein X was transformed into E.coli and the transformed E.coli was deposited on 18 
April 1995 with the ATCC under the terms of the Budapest Treaty and was designated as ATCC 
accession number 69789. A portion of the Klebsiella genome encoding the diol dehydratase 
enzyme activity as well as protein X was transformed into E.coli and the transformed E.coli was 
deposited on 18 April 1995 with the ATCC under the terms of the Budapest Treaty and was 
designated as ATCC accession number 69790. 

Klebsiella glycerol dehydratase protein X is found at bases 9749-11572 of SEQ ID NO:19,
counting the first base of dhaK as position number 1. Citrobacter freundii (ATCC accession
number CFU09771) nucleic acid encoding protein X is found between positions 11261 and 13072.
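
As a worked check on the coordinates above, the sketch below treats both ranges as 1-based and inclusive (an assumption; the text only says "at bases" and "between positions") and computes the length of each region. The helper names are hypothetical. Whether the stated ranges include a stop codon is not specified, so the codon count is not converted to a protein length here.

```python
def region_length(start, end):
    """Length of a region given 1-based inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def one_based_slice(start, end):
    """Convert 1-based inclusive coordinates to a 0-based Python slice."""
    region_length(start, end)  # reuse the validation
    return slice(start - 1, end)


# Coordinates taken from the text; the labels are for display only.
regions = {
    "Klebsiella protein X (SEQ ID NO:19)": (9749, 11572),
    "Citrobacter freundii protein X": (11261, 13072),
}

for name, (start, end) in regions.items():
    n = region_length(start, end)
    print(name, n, "bases,", n // 3, "codons, remainder", n % 3)
# Klebsiella protein X (SEQ ID NO:19) 1824 bases, 608 codons, remainder 0
# Citrobacter freundii protein X 1812 bases, 604 codons, remainder 0
```

Both lengths are exact multiples of three, which is consistent with each range spanning a whole open reading frame.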

The present invention encompasses genes encoding dehydratase protein X that are 
recombinantly introduced and replicate on a plasmid in the host organism as well as genes that 
are stably maintained in the host genome. The present invention encompasses a method for 
enhanced production of 1,3-propanediol wherein the gene encoding protein X is transformed in a 
host cell together with genes encoding the dehydratase activity and/or other genes necessary for 
the production of 1,3-propanediol. The gene encoding protein X, dehydratase activity and/or
other genes may be on the same or different expression cassettes. Alternatively, the gene 
encoding protein X may be transformed separately, either before or after genes encoding the 
dehydratase activity and/or other activities. The present invention encompasses host cells having
endogenous nucleic acid encoding protein X as well as host cells lacking endogenous nucleic acid
encoding protein X.

The terms "protein 1", "protein 2" and "protein 3" refer to the proteins encoded in a
microorganism that are comparable to protein 1 (SEQ ID NO: 60 or SEQ ID NO: 61) (also referred
to as orfY), protein 2 (SEQ ID NO: 62 or SEQ ID NO: 63) (also referred to as orfX) and protein 3
(SEQ ID NO: 64 or SEQ ID NO: 65) (also referred to as orfW), respectively.

Preferably, in the presence of protein X, at least one of proteins 1, 2 and 3 is capable of 
increasing the production of 1,3-propanediol in a host organism over the production of 1,3- 
propanediol in the absence of protein X and at least one of proteins 1, 2 and 3 in the host 
organism. Being comparable means that DNA encoding the protein is either in the same 
structural location as DNA encoding the respective proteins, as shown in Figure 1, or that the 
respective proteins have overall amino acid similarity to the respective SEQ ID NOS shown in 
Figures 3, 4 and 5. 

The present invention encompasses protein 1 molecules having at least 50%; or at least
65%; or at least 80%; or at least 90% or at least 95% similarity to SEQ ID NO: 60 or SEQ ID NO:
61. The present invention encompasses protein 2 molecules having at least 50%; or at least
65%; or at least 80%; or at least 90% or at least 95% similarity to SEQ ID NO: 62 or SEQ ID NO: 63.
The present invention encompasses protein 3 molecules having at least 50%; or at least 65%; or
at least 80%; or at least 90% or at least 95% similarity to SEQ ID NO: 64 or SEQ ID NO: 65.

Included within the terms "protein 1", "protein 2" and "protein 3", respectively, are orfY,
orfX and orfW from Clostridium pasteurianum (Luers, et al., supra) as well as molecules having at
least 50%; or at least 65%; or at least 80%; or at least 90% or at least 95% similarity to C.
pasteurianum orfY, orfX or orfW. The present invention also encompasses amino acid variations
of proteins 1, 2 and 3 from any microorganism as long as the protein variant, in combination with
protein X, retains its essential functional characteristics of increasing the production of 1,3-
propanediol in a host organism over the production of 1,3-propanediol in the host organism in
their absence. 

The present invention encompasses a method for enhanced production of 1,3-propanediol 
wherein the gene(s) encoding at least one of protein 1, protein 2 and protein 3 is transformed in a 
host cell together with genes encoding protein X, the dehydratase activity and/or other genes 
necessary for the production of 1,3-propanediol. The gene(s) encoding at least one of proteins 1, 2
and 3, protein X, dehydratase activity and/or other genes may be on the same or different 
expression cassettes. Alternatively, the gene(s) encoding at least one of proteins 1, 2 and 3 may 
be transformed separately, either before or after genes encoding the dehydratase activity and/or 
other activities. The present invention encompasses host cells having endogenous nucleic acid
encoding protein 1, protein 2 or protein 3 as well as host cells lacking endogenous nucleic acid
encoding the proteins.

The terms "oxidoreductase" or "1,3-propanediol oxidoreductase" refer to the
polypeptide(s) responsible for an enzyme activity that is capable of catalyzing the reduction of
3-hydroxypropionaldehyde to 1,3-propanediol. 1,3-Propanediol oxidoreductase includes, for
example, the polypeptide encoded by the dhaT gene (GenBank U09771, U30903) and is identified
as SEQ ID NO:4.

The terms "glycerol-3-phosphate dehydrogenase" or "G3PDH" refer to the polypeptide(s) 
responsible for an enzyme activity capable of catalyzing the conversion of dihydroxyacetone
phosphate (DHAP) to glycerol-3-phosphate (G3P). In vivo G3PDH may be NADH-, NADPH-, or
FAD-dependent. Examples of this enzyme activity include the following: NADH-dependent
enzymes (EC 1.1.1.8) are encoded by several genes including GPD1 (GenBank Z74071x2) or
GPD2 (GenBank Z35169x1) or GPD3 (GenBank G984182) or DAR1 (GenBank Z74071x2); a
NADPH-dependent enzyme (EC 1.1.1.94) is encoded by gpsA (GenBank U32164, G466746 (cds
197911-196892), and L45246); and FAD-dependent enzymes (EC 1.1.99.5) are encoded by
GUT2 (GenBank Z47047x23) or glpD (GenBank G147838) or glpABC (GenBank M20938).

The terms "glycerol-3-phosphatase" or "sn-glycerol-3-phosphatase" or "d,l-glycerol 
phosphatase" or "G3P phosphatase" refer to the polypeptide(s) responsible for an enzyme activity 
that is capable of catalyzing the conversion of glycerol-3-phosphate to glycerol. G3P 
phosphatase includes, for example, the polypeptides encoded by GPP1 (GenBank Z47047x125)
or GPP2 (GenBank U18813x11).
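
Taken together, the enzyme definitions above describe a four-step route from dihydroxyacetone phosphate to 1,3-propanediol. The sketch below encodes that route as an ordered list and reports how far a strain can proceed given the activities it carries. It is only an illustration of the definitions: the data structure, the function name and the activity labels are hypothetical, and cofactors, regulation and competing reactions are ignored.

```python
# (activity, substrate, product) in pathway order, as defined in the text.
PATHWAY = [
    ("G3PDH", "dihydroxyacetone phosphate", "glycerol-3-phosphate"),
    ("G3P phosphatase", "glycerol-3-phosphate", "glycerol"),
    ("dehydratase (dhaB)", "glycerol", "3-hydroxypropionaldehyde"),
    ("oxidoreductase (dhaT)", "3-hydroxypropionaldehyde", "1,3-propanediol"),
]

ALL_ACTIVITIES = {step[0] for step in PATHWAY}


def furthest_product(start, activities):
    """Follow the linear pathway from `start` while the needed activity is present."""
    current = start
    for activity, substrate, product in PATHWAY:
        if substrate == current and activity in activities:
            current = product
    return current


# A strain with all four activities converts DHAP to 1,3-propanediol.
print(furthest_product("dihydroxyacetone phosphate", ALL_ACTIVITIES))

# Without a dehydratase activity, the route stops at glycerol.
no_dehydratase = ALL_ACTIVITIES - {"dehydratase (dhaB)"}
print(furthest_product("dihydroxyacetone phosphate", no_dehydratase))
```

The second call shows why the dehydratase step, and therefore its in vivo reactivation, matters for the overall conversion: every downstream product depends on it.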

The term "glycerol kinase" refers to the polypeptide(s) responsible for an enzyme activity 
capable of catalyzing the conversion of glycerol to glycerol-3-phosphate or glycerol-3-phosphate 
to glycerol, depending on reaction conditions. Glycerol kinase includes, for example, the
polypeptide encoded by GUT1 (GenBank U11583x19).

The terms "GPD1", "DAR1", "OSG1", "D2830", and "YDL022W" will be used 
interchangeably and refer to a gene that encodes a cytosolic glycerol-3-phosphate 
dehydrogenase and characterized by the base sequence given as SEQ ID NO:5. 

The term "GPD2" refers to a gene that encodes a cytosolic glycerol-3-phosphate 
dehydrogenase and characterized by the base sequence given as SEQ ID NO:6. 

The terms "GUT2" and "YIL155C" are used interchangeably and refer to a gene that
encodes a mitochondrial glycerol-3-phosphate dehydrogenase and characterized by the base
sequence given in SEQ ID NO:7.

The terms "GPP1", "RHR2" and "YIL053W" are used interchangeably and refer to a gene
that encodes a cytosolic glycerol-3-phosphatase and characterized by the base sequence given
as SEQ ID NO:8.

The terms "GPP2", "HOR2" and "YER062C" are used interchangeably and refer to a gene
that encodes a cytosolic glycerol-3-phosphatase and characterized by the base sequence given
as SEQ ID NO:9.

The term "GUT1" refers to a gene that encodes a cytosolic glycerol kinase and 
characterized by the base sequence given as SEQ ID NO: 10. 

The terms "function" or "enzyme function" refer to the catalytic activity of an enzyme in 
altering the energy required to perform a specific chemical reaction. It is understood that such an 
activity may apply to a reaction in equilibrium where the production of either product or substrate 
may be accomplished under suitable conditions. 

The terms "polypeptide" and "protein" are used interchangeably.

The terms "carbon substrate" and "carbon source" refer to a carbon source capable of
being metabolized by host organisms of the present invention and particularly carbon sources
selected from the group consisting of monosaccharides, oligosaccharides, polysaccharides, and
one-carbon substrates or mixtures thereof.

The terms "host cell" or "host organism" refer to a microorganism capable of receiving 
foreign or heterologous genes and of expressing those genes to produce an active gene product. 

The terms "foreign gene", "foreign DNA", "heterologous gene" and "heterologous DNA" 
refer to genetic material native to one organism that has been placed within a host organism by 
various means. The gene of interest may be a naturally occurring gene, a mutated gene or a 
synthetic gene. 

The terms "recombinant organism" and "transformed host" refer to any organism having 
been transformed with heterologous or foreign genes or extra copies of homologous genes. The
recombinant organisms of the present invention express foreign genes encoding glycerol-3-
phosphate dehydrogenase (G3PDH) and glycerol-3-phosphatase (G3P phosphatase), glycerol
dehydratase (dhaB), and 1,3-propanediol oxidoreductase (dhaT) for the production of
1,3-propanediol from suitable carbon substrates. 

"Gene" refers to a nucleic acid fragment that expresses a specific protein, including 
regulatory sequences preceding (5' non-coding) and following (3' non-coding) the coding region. 
The terms "native" and "wild-type" refer to a gene as found in nature with its own regulatory 
sequences. 

The terms "encoding" and "coding" refer to the process by which a gene, through the 
mechanisms of transcription and translation, produces an amino acid sequence. It is understood 
that the process of encoding a specific amino acid sequence includes DNA sequences that may 
involve base changes that do not cause a change in the encoded amino acid, or which involve 
base changes which may alter one or more amino acids, but do not affect the functional 
properties of the protein encoded by the DNA sequence. It is therefore understood that the
invention encompasses more than the specific exemplary sequences. Modifications to the
sequence, such as deletions, insertions, or substitutions in the sequence which produce silent
changes that do not substantially affect the functional properties of the resulting protein molecule
are also contemplated. For example, alterations in the gene sequence which reflect the
degeneracy of the genetic code, or which result in the production of a chemically equivalent amino 
acid at a given site, are contemplated. Thus, a codon for the amino acid alanine, a hydrophobic 
amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as 
glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes 
which result in substitution of one negatively charged residue for another, such as aspartic acid 
for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can 
also be expected to produce a biologically equivalent product. Nucleotide changes which result in 
alteration of the N-terminal and C-terminal portions of the protein molecule would also not be 
expected to alter the activity of the protein. In some cases, it may in fact be desirable to make 
mutants of the sequence in order to study the effect of alteration on the biological activity of the 
protein. Each of the proposed modifications is well within the routine skill in the art, as is 
determination of retention of biological activity in the encoded products. Moreover, the skilled 
artisan recognizes that sequences encompassed by this invention are also defined by their ability 
to hybridize, under stringent conditions (0.1X SSC, 0.1% SDS, 65 °C), with the sequences 
exemplified herein. 
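
The distinction drawn above between silent changes (degeneracy of the genetic code) and substitutions by a chemically equivalent residue can be sketched as follows. The codon table is a deliberately partial excerpt of the standard genetic code, the residue groupings are only those named in the paragraph above, and the function names and the twelve-base sequences are hypothetical; none of this says anything about whether a particular change preserves activity in a real protein.

```python
# Partial standard codon table: only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "GAT": "D", "GAC": "D",                          # aspartic acid
    "GAA": "E", "GAG": "E",                          # glutamic acid
    "AAA": "K", "AAG": "K",                          # lysine
    "CGT": "R", "CGC": "R",                          # arginine
}

# Groupings named in the text: A with G/V/L/I, D with E, K with R.
EQUIVALENT_GROUPS = [set("AGVLI"), set("DE"), set("KR")]


def translate(dna):
    """Translate a coding sequence whose codons all appear in CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitutions(original, variant):
    """Return (position, old, new) for each changed residue, 1-based."""
    p, q = translate(original), translate(variant)
    if len(p) != len(q):
        raise ValueError("sequences must encode proteins of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(p, q), start=1) if a != b]


def is_equivalent(a, b):
    """True if two residues are identical or share a group named in the text."""
    return a == b or any(a in g and b in g for g in EQUIVALENT_GROUPS)


reference = "ATGGCTGATAAA"      # encodes M-A-D-K
silent = "ATGGCCGACAAG"         # three synonymous codon changes
conservative = "ATGGCTGAAAAA"   # GAT -> GAA, aspartic acid to glutamic acid

print(substitutions(reference, silent))        # []
print(substitutions(reference, conservative))  # [(3, 'D', 'E')]
```

The first variant differs from the reference at three nucleotide positions yet encodes the same peptide; the second changes one residue within a group the text treats as biologically equivalent.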

The term "expression" refers to the transcription and translation to gene product from a 
gene coding for the sequence of the gene product. 

The terms "plasmid", "vector", and "cassette" refer to an extra chromosomal element 
often carrying genes which are not part of the central metabolism of the cell, and usually in the 
form of circular double-stranded DNA molecules. Such elements may be autonomously 
replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or 
circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a 
number of nucleotide sequences have been joined or recombined into a unique construction 
which is capable of introducing a promoter fragment and DNA sequence for a selected gene 
product along with appropriate 3' untranslated sequence into a cell. "Transformation cassette"
refers to a specific vector containing a foreign gene and having elements in addition to the foreign
gene that facilitate transformation of a particular host cell. "Expression cassette" refers to a 
specific vector containing a foreign gene and having elements in addition to the foreign gene that 
allow for enhanced expression of that gene in a foreign host. 

The terms "transformation" and "transfection" refer to the acquisition of new genes in a 
cell after the incorporation of nucleic acid. The acquired genes may be integrated into
chromosomal DNA or introduced as extrachromosomal replicating sequences. The term
"transformant" refers to the product of a transformation. 

The term "genetically altered" refers to the process of changing hereditary material by 
transformation or mutation. 

The term "isolated" refers to a protein or DNA sequence that is removed from at least one
component with which it is naturally associated. 

The term "homologous" refers to a protein or polypeptide native or naturally occurring in a 
gram-positive host cell. The invention includes microorganisms producing the homologous 
protein via recombinant DNA technology. 

CONSTRUCTION OF RECOMBINANT ORGANISMS 

Recombinant organisms containing the necessary genes that will encode the enzymatic 
pathway for the conversion of a carbon substrate to 1,3-propanediol may be constructed using 
techniques well known in the art. As discussed in Example 9, genes encoding Klebsiella dhaB1,
dhaB2, dhaB3 and protein X were used to transform E. coli DH5α and in Example 10, genes
encoding at least one of Klebsiella proteins 1, 2 and 3 as well as at least one gene encoding
protein X were used to transform E. coli.

Genes encoding glycerol-3-phosphate dehydrogenase (G3PDH), glycerol-3-phosphatase
(G3P phosphatase), glycerol dehydratase (dhaB), and 1,3-propanediol oxidoreductase (dhaT)
were isolated from a native host such as Klebsiella or Saccharomyces and used to transform host
strains such as E. coli DH5α, ECL707, AA200, or W1485; the Saccharomyces cerevisiae strain
YPH500; or the Klebsiella pneumoniae strains ATCC 25955 or ECL 2106.

Isolation of Genes

Methods of obtaining desired genes from a bacterial genome are common and well 
known in the art of molecular biology. For example, if the sequence of the gene is known, 
suitable genomic libraries may be created by restriction endonuclease digestion and may be 
screened with probes complementary to the desired gene sequence. Once the sequence is 
isolated, the DNA may be amplified using standard primer directed amplification methods such as 
polymerase chain reaction (PCR) (U.S. 4,683,202) to obtain amounts of DNA suitable for 
transformation using appropriate vectors. 

Alternatively, cosmid libraries may be created where large segments of genomic DNA 
(35-45kb) may be packaged into vectors and used to transform appropriate hosts. Cosmid 
vectors are unique in being able to accommodate large quantities of DNA. Generally, cosmid 
vectors have at least one copy of the cos DNA sequence which is needed for packaging and 
subsequent circularization of the foreign DNA. In addition to the cos sequence these vectors will 
also contain an origin of replication such as ColE1 and drug resistance markers such as a gene
conferring resistance to ampicillin or neomycin. Methods of using cosmid vectors for the
transformation of suitable bacterial hosts are well described in Sambrook et al., Molecular
Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY (1989).

Typically to clone cosmids, foreign DNA is isolated and ligated, using the appropriate
restriction endonucleases, adjacent to the cos region of the cosmid vector. Cosmid vectors
containing the linearized foreign DNA are then reacted with a DNA packaging vehicle such as
bacteriophage λ. During the packaging process the cos sites are cleaved and the foreign DNA is
packaged into the head portion of the bacterial viral particle. These particles are then used to 
transfect suitable host cells such as E. coli. Once injected into the cell, the foreign DNA 
circularizes under the influence of the cos sticky ends. In this manner large segments of foreign 
DNA can be introduced and expressed in recombinant host cells. 

Isolation and cloning of genes encoding glycerol dehydratase (dhaB) and 1,3-propanediol
oxidoreductase (dhaT)

Cosmid vectors and cosmid transformation methods were used within the context of the 
present invention to clone large segments of genomic DNA from bacterial genera known to 
possess genes capable of processing glycerol to 1,3-propanediol. Specifically, genomic DNA 
from K. pneumoniae ATCC 25955 was isolated by methods well known in the art and digested
with the restriction enzyme Sau3A for insertion into a cosmid vector Supercos 1 and packaged
using Gigapack II packaging extracts. Following construction of the vector, E. coli XL1-Blue MR
cells were transformed with the cosmid DNA. Transformants were screened for the ability to
convert glycerol to 1,3-propanediol by growing the cells in the presence of glycerol and analyzing
the media for 1,3-propanediol formation. 
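
By way of illustration only, the digestion step can be sketched in code. Sau3A recognizes the
sequence GATC and cuts immediately 5' of the G; the short Python sketch below locates those sites
in a linear DNA string and returns the fragments of a complete digest. The function names and the
toy sequence are hypothetical and are not part of the procedure described above.

```python
def sau3a_sites(seq: str) -> list[int]:
    """Return the 0-based start position of every GATC site in seq."""
    seq = seq.upper()
    sites = []
    pos = seq.find("GATC")
    while pos != -1:
        sites.append(pos)
        pos = seq.find("GATC", pos + 1)
    return sites


def digest(seq: str) -> list[str]:
    """Cut a linear sequence immediately 5' of each GATC site (complete digest)."""
    cuts = [0] + sau3a_sites(seq) + [len(seq)]
    # Skip empty fragments, which arise when a site sits at the very start.
    return [seq[a:b] for a, b in zip(cuts, cuts[1:]) if b > a]


toy = "AAGATCTTTGGATCCAAA"  # hypothetical sequence with two GATC sites
print(sau3a_sites(toy))      # [2, 10]
print(digest(toy))           # ['AA', 'GATCTTTG', 'GATCCAAA']
```

Every internal fragment begins with GATC, which reflects the cohesive end that allows ligation
into a compatible site of the vector.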

Two of the 1,3-propanediol positive transformants were analyzed and the cosmids were
named pKP1 and pKP2. DNA sequencing revealed extensive homology to the glycerol 
dehydratase gene (dhaB) from C. freundii, demonstrating that these transformants contained DNA 
encoding the glycerol dehydratase gene. Other 1,3-propanediol positive transformants were 
analyzed and the cosmids were named pKP4 and pKP5. DNA sequencing revealed that these 
cosmids carried DNA encoding a diol dehydratase gene. 
Isolation of genes encoding protein X, protein 1, protein 2 and protein 3

Although the instant invention utilizes the isolated genes from within a Klebsiella cosmid, 
alternate sources of dehydratase genes and protein X and protein 1, protein 2 and protein 3
include, but are not limited to, Citrobacter, Clostridia, and Salmonella. Tobimatsu, et al., 1996, J.
Biol. Chem. 271: 22352-22357 disclose the K. pneumoniae glycerol dehydratase operon where
protein X is identified as ORF 4; Seyfried et al., 1996, J. Bacteriol. 178: 5793-5796 disclose the C.
freundii glycerol dehydratase operon where protein X is identified as ORF Z. Figure 8 discloses
Klebsiella diol dehydratase protein X and Figures 3, 4 and 5 disclose amino acid sequences of
proteins 1, 2 and 3 from Klebsiella and Citrobacter.
Genes encoding G3PDH and G3P phosphatase 

The present invention provides genes suitable for the expression of G3PDH and G3P 
phosphatase activities in a host cell. 

Genes encoding G3PDH are known. For example, GPD1 has been isolated from 
Saccharomyces and has the base sequence given by SEQ ID NO:5, encoding the amino acid
sequence given in SEQ ID NO:11 (Wang et al., supra). Similarly, G3PDH activity has also
been isolated from Saccharomyces encoded by GPD2 having the base sequence given in SEQ
ID NO:6, encoding the amino acid sequence given in SEQ ID NO:12 (Eriksson et al., Mol.
Microbiol. 17, 95, (1995)).

It is contemplated that any gene encoding a polypeptide responsible for G3PDH activity is 
suitable for the purposes of the present invention wherein that activity is capable of catalyzing the 
conversion of dihydroxyacetone phosphate (DHAP) to glycerol-3-phosphate (G3P). Further, it is 
contemplated that any gene encoding the amino acid sequence of G3PDH as given by any one of 
SEQ ID NOS:11, 12, 13, 14, 15 and 16 corresponding to the genes GPD1, GPD2, GUT2, gpsA, 
glpD, and the α subunit of glpABC, respectively, will be functional in the present invention wherein
that amino acid sequence encompasses amino acid substitutions, deletions or additions that do
not alter the function of the enzyme. It will be appreciated by the skilled person that genes
encoding G3PDH isolated from other sources are also suitable for use in the present invention.
For example, genes isolated from prokaryotes include GenBank accessions M34393, M20938, 
L06231, U12567, L45246, L45323, L45324, L45325, U32164, and U39682; genes isolated from 
fungi include GenBank accessions U30625, U30876 and X56162; genes isolated from insects 
include GenBank accessions X61223 and X14179; and genes isolated from mammalian sources 
include GenBank accessions U12424, M25558 and X78593. 

Genes encoding G3P phosphatase are known. For example, GPP2 has been isolated
from Saccharomyces cerevisiae and has the base sequence given by SEQ ID NO:9 which 
encodes the amino acid sequence given in SEQ ID NO:17 (Norbeck et al., J. Biol. Chem. 271, 
p. 13875, 1996). 

It is contemplated that any gene encoding a G3P phosphatase activity is suitable for the 
purposes of the present invention wherein that activity is capable of catalyzing the conversion of 
glycerol-3-phosphate to glycerol. Further, it is contemplated that any gene encoding the amino 
acid sequence of G3P phosphatase as given by SEQ ID NOS:33 and 17 will be functional in the 
present invention wherein that amino acid sequence encompasses amino acid substitutions, 
deletions or additions that do not alter the function of the enzyme. It will be appreciated by the 
skilled person that genes encoding G3P phosphatase isolated from other sources are also
suitable for use in the present invention. For example, the dephosphorylation of glycerol-3-
phosphate to yield glycerol may be achieved with one or more of the following general or specific
phosphatases: alkaline phosphatase (EC 3.1.3.1) [GenBank M19159, M29663, U02550 or
M33965]; acid phosphatase (EC 3.1.3.2) [GenBank U51210, U19789, U28658 or L20566];
glycerol-3-phosphatase (EC 3.1.3.-) [GenBank Z38060 or U18813x11]; glucose-1-phosphatase
(EC 3.1.3.10) [GenBank M33807]; glucose-6-phosphatase (EC 3.1.3.9) [GenBank U00445];
fructose-1,6-bisphosphatase (EC 3.1.3.11) [GenBank X12545 or J03207] or phosphatidylglycerophosphate
phosphatase (EC 3.1.3.27) [GenBank M23546 and M23628].

Genes encoding glycerol kinase are known. For example, GUT1 encoding the glycerol 
kinase from Saccharomyces has been isolated and sequenced (Pavlik et al., Curr. Genet. 24, 21,
(1993)) and the base sequence is given by SEQ ID NO:10 which encodes the amino acid 
sequence given in SEQ ID NO:18. It will be appreciated by the skilled artisan that although 
glycerol kinase catalyzes the degradation of glycerol in nature the same enzyme will be able to 
function in the synthesis of glycerol to convert glycerol-3-phosphate to glycerol under the 
appropriate reaction energy conditions. Evidence exists for glycerol production through a glycerol 
kinase. Under anaerobic or respiration-inhibited conditions, Trypanosoma brucei gives rise to 
glycerol in the presence of Glycerol-3-P and ADP. The reaction occurs in the glycosome 
compartment (D. Hammond, J. Biol. Chem. 260, 15646-15654, (1985)). 
Host cells 

Suitable host cells for the recombinant production of 1,3-propanediol may be either
prokaryotic or eukaryotic and will be limited only by the host cell ability to express active enzymes.
Preferred hosts will be those typically useful for production of glycerol or 1,3-propanediol such as
Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus, 
Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida, 
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella, Bacillus, 
Streptomyces and Pseudomonas. Most preferred in the present invention are E. coli, Klebsiella 
species and Saccharomyces species. 

Adenosyl-cobalamin (coenzyme B12) is an essential cofactor for glycerol dehydratase
activity. The coenzyme is the most complex non-polymeric natural product known, and its
synthesis in vivo is directed using the products of about 30 genes. Synthesis of coenzyme B12 is
found in prokaryotes, some of which are able to synthesize the compound de novo, while others
can perform partial reactions. E. coli, for example, cannot fabricate the corrin ring structure, but is
able to catalyze the conversion of cobinamide to corrinoid and can introduce the 5'-deoxyadenosyl
group.

Eukaryotes are unable to synthesize coenzyme B12 de novo and instead transport vitamin
B12 from the extracellular milieu with subsequent conversion of the compound to its functional
form by cellular enzymes. Three enzyme activities have been described for this
series of reactions: 1) aquacobalamin reductase (EC 1.6.99.8) reduces Co(III) to Co(II);
2) cob(II)alamin reductase (EC 1.6.99.9) reduces Co(II) to Co(I); and 3) cob(I)alamin
adenosyltransferase (EC 2.5.1.17) transfers a 5'-deoxyadenosine moiety from ATP to the reduced
corrinoid. This last enzyme activity is the best characterized of the three, and is encoded by cobA
in S. typhimurium, btuR in E. coli and cobO in P. denitrificans. These three cob(I)alamin
adenosyltransferase genes have been cloned and sequenced. Cob(I)alamin adenosyltransferase
activity has been detected in human fibroblasts and in isolated rat mitochondria (Fenton et al.,
Biochem. Biophys. Res. Commun. 98, 283-9, (1981)). The two enzymes involved in cobalt
reduction are poorly characterized and gene sequences are not available. There are reports of an
aquacobalamin reductase from Euglena gracilis (Watanabe et al., Arch. Biochem. Biophys. 305,
421-7, (1993)) and a cob(III)alamin reductase is present in the microsomal and
mitochondrial inner membrane fractions from rat fibroblasts (Pezacka, Biochim. Biophys. Acta,
1157, 167-77, (1993)).

Supplementing culture media with vitamin B12 may satisfy the need to produce coenzyme
B12 for glycerol dehydratase activity in many microorganisms, but in some cases additional
catalytic activities may have to be added or increased in vivo. Enhanced synthesis of coenzyme
B12 in eukaryotes may be particularly desirable. Given the published sequences for genes
encoding cob(I)alamin adenosyltransferase, the cloning and expression of this gene could be
accomplished by one skilled in the art. For example, it is contemplated that yeast, such as
Saccharomyces, could be constructed so as to contain genes encoding cob(I)alamin
adenosyltransferase in addition to the genes necessary to effect conversion of a carbon substrate
such as glucose to 1,3-propanediol. Cloning and expression of the genes for cobalt reduction
requires a different approach. This could be based on a selection in E. coli for growth on
ethanolamine as sole nitrogen source. In the presence of coenzyme B12, ethanolamine ammonia-lyase
enables growth of cells in the absence of other nitrogen sources. If E. coli cells contain a cloned gene
for cob(I)alamin adenosyltransferase and random cloned DNA from another organism, growth on
ethanolamine in the presence of aquacobalamin should be enhanced and selected for if the
random cloned DNA encodes cobalt reduction properties to facilitate adenosylation of
aquacobalamin.

Glycerol dehydratase is a multi-subunit enzyme consisting of three protein components
which are arranged in an α2β2γ2 configuration (M. Seyfried et al., J. Bacteriol. 5793-5796 (1996)).
This configuration is an inactive apo-enzyme which binds one molecule of coenzyme B12 to
become the catalytically active holo-enzyme. During catalysis, the holo-enzyme undergoes rapid,
first order inactivation, to become an inactive complex in which the coenzyme B12 has been
converted to hydroxycobalamin (Z. Schneider and J. Pawelkiewicz, ACTA Biochim. Pol. 311-328
(1966)). Stoichiometric analysis of the reaction of glycerol dehydratase with glycerol as substrate
revealed that each molecule of enzyme catalyzes 100,000 reactions before inactivation (Z.
Schneider and J. Pawelkiewicz, ACTA Biochim. Pol. 311-328 (1966)). In vitro, this inactive
complex can only be reactivated by removal of the hydroxycobalamin, by strong chemical
treatment with magnesium and sulfite, and replacement with additional coenzyme B12 (Z.
Schneider et al., J. Biol. Chem. 3388-3396 (1970)). Inactivated glycerol dehydratase in wild type
Klebsiella pneumoniae can be reactivated in situ (toluenized cells) in the presence of coenzyme
B12, adenosine 5'-triphosphate (ATP), and manganese (S. Honda et al., J. Bacteriol. 1458-1465
(1980)). This reactivation was shown to be due to the ATP dependent replacement of the
inactivated cobalamin with coenzyme B12 (K. Ushio et al., J. Nutr. Sci. Vitaminol. 225-236 (1982)).
Cell extract from toluenized cells which in situ catalyze the ATP, manganese, and coenzyme B12
dependent reactivation is inactive with respect to this reactivation. Thus, without strong
chemical reductive treatment or cell mediated replacement of the inactivated cofactor, glycerol
dehydratase can only catalyze 100,000 reactions per molecule.
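
The limit of 100,000 reactions per molecule places a ceiling on the amount of product obtainable
from a given amount of dehydratase. The following Python sketch works through that arithmetic.
The function name, and the assumption that each reactivation restores the enzyme for a further
100,000 turnovers, are illustrative simplifications and not measured values.

```python
TURNOVERS_PER_ENZYME = 100_000  # reactions per enzyme molecule before inactivation (from the text)


def max_product_mol(enzyme_mol: float, reactivation_cycles: int = 0) -> float:
    """Upper bound on moles of product from `enzyme_mol` moles of dehydratase.

    Assumes (hypothetically) that every reactivation cycle fully restores the
    enzyme for another 100,000 turnovers.
    """
    if enzyme_mol < 0 or reactivation_cycles < 0:
        raise ValueError("inputs must be non-negative")
    return enzyme_mol * TURNOVERS_PER_ENZYME * (reactivation_cycles + 1)


# 1 nmol of enzyme, no reactivation: about 0.1 mmol of product at most.
print(max_product_mol(1e-9))
# The same enzyme reactivated nine times: about 1 mmol at most.
print(max_product_mol(1e-9, reactivation_cycles=9))
```

Under these assumptions the product ceiling scales linearly with the number of reactivation
cycles, which is why in vivo reactivation matters for sustained production.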

The present invention demonstrates that the presence of protein X is important for in vivo 
reactivation of the dehydratase and the production of 1,3-propanediol is increased in a host cell 
capable of producing 1,3-propanediol in the presence of protein X. The present invention also
discloses that the presence of protein 1, protein 2 and protein 3, in combination with protein X, 
also increased the production of 1,3-propanediol in a host cell capable of producing 1,3- 
propanediol. 

In addition to E. coli and Saccharomyces, Klebsiella is a particularly preferred host. 
Strains of Klebsiella pneumoniae are known to produce 1,3-propanediol when grown on glycerol 
as the sole carbon source. It is contemplated that Klebsiella can be genetically altered to produce
1,3-propanediol from monosaccharides, oligosaccharides, polysaccharides, or one-carbon 
substrates. 

In order to engineer such strains, it will be advantageous to provide the Klebsiella host 
with the genes facilitating conversion of dihydroxyacetone phosphate to glycerol and conversion 
of glycerol to 1,3-propanediol either separately or together, under the transcriptional control of one
or more constitutive or inducible promoters. The introduction of the DAR1 and GPP2 genes 
encoding glycerol-3-phosphate dehydrogenase and glycerol-3-phosphatase, respectively, will 
provide Klebsiella with genetic machinery to produce 1,3-propanediol from an appropriate carbon 
substrate. 

The genes encoding protein X, protein 1, protein 2 and protein 3 or other enzymes 
associated with 1,3-propanediol production (e.g., G3PDH, G3P phosphatase, dhaB and/or dhaT) 
may be introduced on any plasmid vector capable of replication in K. pneumoniae or they may be
integrated into the K. pneumoniae genome. For example, K. pneumoniae ATCC 25955 and
K. pneumoniae ECL 2106 are known to be sensitive to tetracycline or chloramphenicol; thus
plasmid vectors which are both capable of replicating in K. pneumoniae and encoding resistance 
to either or both of these antibiotics may be used to introduce these genes into K. pneumoniae. 
Methods of transforming Klebsiella with genes of interest are common and well known in the art 
and suitable protocols, including appropriate vectors and expression techniques may be found in 
Sambrook, supra. 
Vectors and expression cassettes 

The present invention provides a variety of vectors and transformation and expression 
cassettes suitable for the cloning, transformation and expression of protein X, protein 1, protein 2 
and protein 3 as well as other proteins associated with 1,3-propanediol production, e.g., G3PDH
and G3P phosphatase into a suitable host cell. Suitable vectors will be those which are
compatible with the bacterium employed. Suitable vectors can be derived, for example, from a
bacterium, a virus (such as bacteriophage T7 or an M-13 derived phage), a cosmid, a yeast or a
plant. Protocols for obtaining and using such vectors are known to those in the art (Sambrook et
al., Molecular Cloning: A Laboratory Manual, volumes 1, 2, 3 (Cold Spring Harbor Laboratory,
Cold Spring Harbor, NY, (1989))).

Typically, the vector or cassette contains sequences directing transcription and translation 
of the relevant gene, a selectable marker, and sequences allowing autonomous replication or 
chromosomal integration. Suitable vectors comprise a region 5' of the gene which harbors 
transcriptional initiation controls and a region 3' of the DNA fragment which controls 
transcriptional termination. It is most preferred when both control regions are derived from genes 
homologous to the transformed host cell although it is to be understood that such control regions 
need not be derived from the genes native to the specific species chosen as a production host. 

Initiation control regions or promoters, which are useful to drive expression of protein
X and protein 1, protein 2 or protein 3 in the desired host cell, are numerous and familiar to those
skilled in the art. Virtually any promoter capable of driving these genes is suitable for the present
invention including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH,
ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1 (useful
for expression in Pichia); and lac, trp, λPL, λPR, T7, tac, and trc (useful for expression in E. coli).
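
For convenience, the promoter and host pairings listed above can be held in a simple lookup
table. The Python sketch below is illustrative only: the dictionary layout and the function name
are assumptions, the promoter names are written in their standard spellings (for example PHO5,
and "lambda PL" and "lambda PR" for the bacteriophage lambda promoters), and the list is no more
exhaustive than the text it mirrors.

```python
# Promoters named in the text, grouped by the host in which they are said to be useful.
PROMOTERS_BY_HOST = {
    "Saccharomyces": ["CYC1", "HIS3", "GAL1", "GAL10", "ADH1", "PGK", "PHO5",
                      "GAPDH", "ADC1", "TRP1", "URA3", "LEU2", "ENO", "TPI"],
    "Pichia": ["AOX1"],
    "E. coli": ["lac", "trp", "lambda PL", "lambda PR", "T7", "tac", "trc"],
}


def hosts_for(promoter: str) -> list[str]:
    """Return the hosts for which the named promoter is listed (case-sensitive)."""
    return [host for host, names in PROMOTERS_BY_HOST.items() if promoter in names]


print(hosts_for("AOX1"))  # ['Pichia']
print(hosts_for("tac"))   # ['E. coli']
```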

Termination control regions may also be derived from various genes native to the 
preferred hosts. Optionally, a termination site may be unnecessary; however, it is most preferred
if included.

For effective expression of the instant enzymes, DNA encoding the enzymes is linked
operably through initiation codons to selected expression control regions such that expression
results in the formation of the appropriate messenger RNA.
Transformation of suitable hosts and expression of genes for the
production of 1,3-propanediol

Once suitable cassettes are constructed they are used to transform appropriate host cells. 
Introduction of the cassette containing dhaB activity, dhaB protein X and at least one of protein 1, 
protein 2 and protein 3 and optionally 1,3-propanediol oxidoreductase (dhaT), either separately or 
together, into the host cell may be accomplished by known procedures such as by transformation 
(e.g., using calcium-permeabilized cells, electroporation) or by transfection using a recombinant 
phage virus (Sambrook et al., supra). In the present invention, E. coli DH5α was transformed
with dhaB subunits 1, 2 and 3 and dha protein X.

Additionally, E. coli W2042 (ATCC 98188) containing the genes encoding glycerol-3- 
phosphate dehydrogenase (G3PDH) and glycerol-3-phosphatase (G3P phosphatase), glycerol 
dehydratase (dhaB), and 1,3-propanediol oxidoreductase (dhaT) was created. Additionally, 
S. cerevisiae YPH500 (ATCC 74392) harboring plasmids pMCK10, pMCK17, pMCK30 and
pMCK35 containing genes encoding glycerol-3-phosphate dehydrogenase (G3PDH) and 
glycerol-3-phosphatase (G3P phosphatase), glycerol dehydratase (dhaB), and 1,3-propanediol 
oxidoreductase (dhaT) was constructed. Both the above-mentioned transformed E. coli and 
Saccharomyces represent preferred embodiments of the invention. 
Media and Carbon Substrates : 

Fermentation media in the present invention must contain suitable carbon substrates. 
Suitable substrates may include but are not limited to monosaccharides such as glucose and 
fructose, oligosaccharides such as lactose or sucrose, polysaccharides such as starch or 
cellulose, or mixtures thereof, and unpurified mixtures from renewable feedstocks such as cheese 
whey permeate, cornsteep liquor, sugar beet molasses, and barley malt. Additionally, the carbon 
substrate may also be one-carbon substrates such as carbon dioxide, or methanol for which 
metabolic conversion into key biochemical intermediates has been demonstrated. Glycerol 
production from single carbon sources (e.g., methanol, formaldehyde, or formate) has been 
reported in methylotrophic yeasts (Yamada et al., Agric. Biol. Chem., 53(2) 541-543, (1989)) and 
in bacteria (Hunter et al., Biochemistry, 24, 4148-4155, (1985)). These organisms can assimilate
single carbon compounds, ranging in oxidation state from methane to formate, and produce
glycerol. The pathway of carbon assimilation can be through ribulose monophosphate, through
serine, or through xylulose-monophosphate (Gottschalk, Bacterial Metabolism, Second Edition,
Springer-Verlag: New York (1986)). The ribulose monophosphate pathway involves the
condensation of formate with ribulose-5-phosphate to form a 6 carbon sugar that becomes 
fructose and eventually the three carbon product glyceraldehyde-3-phosphate. Likewise, the 
serine pathway assimilates the one-carbon compound into the glycolytic pathway via 
methylenetetrahydrofolate.

In addition to utilization of one and two carbon substrates, methylotrophic organisms are
also known to utilize a number of other carbon-containing compounds such as methylamine,
glucosamine and a variety of amino acids for metabolic activity. For example, methylotrophic
yeast are known to utilize the carbon from methylamine to form trehalose or glycerol (Bellion et
al., Microb. Growth C1 Compd., [Int. Symp.], 7th (1993), 415-32. Editor(s): Murrell, J. Collin;
Kelly, Don P. Publisher: Intercept, Andover, UK). Similarly, various species of Candida will
metabolize alanine or oleic acid (Sulter et al., Arch. Microbiol., 153(5), 485-9 (1990)). Hence, the
source of carbon utilized in the present invention may encompass a wide variety of 
carbon-containing substrates and will only be limited by the requirements of the host organism. 

Although it is contemplated that all of the above mentioned carbon substrates and 
mixtures thereof are suitable in the present invention, preferred carbon substrates are 
monosaccharides, oligosaccharides, polysaccharides, and one-carbon substrates. More 
preferred are sugars such as glucose, fructose, sucrose and single carbon substrates such as 
methanol and carbon dioxide. Most preferred is glucose. 

In addition to an appropriate carbon source, fermentation media must contain suitable 
minerals, salts, cofactors, buffers and other components, known to those skilled in the art, 
suitable for the growth of the cultures and promotion of the enzymatic pathway necessary for 
glycerol production. Particular attention is given to Co(II) salts and/or vitamin B12 or precursors
thereof. 

Culture Conditions : 

Typically, cells are grown at 30 °C in appropriate media. Preferred growth media in the 
present invention are common commercially prepared media such as Luria Bertani (LB) broth, 
Sabouraud Dextrose (SD) broth or Yeast Malt Extract (YM) broth. Other defined or synthetic 
growth media may also be used and the appropriate medium for growth of the particular 
microorganism will be known by someone skilled in the art of microbiology or fermentation 
science. The use of agents known to modulate catabolite repression directly or indirectly, e.g., 
cyclic adenosine 2':3'-monophosphate or cyclic adenosine 2':5'-monophosphate, may also be
incorporated into the reaction media. Similarly, the use of agents known to modulate enzymatic 
activities (e.g., sulphites, bisulphites and alkalis) that lead to enhancement of glycerol production 
may be used in conjunction with or as an alternative to genetic manipulations. 

Suitable pH ranges for the fermentation are between pH 5.0 and pH 9.0, where pH 6.0 to
pH 8.0 is preferred as the range for the initial condition.

Reactions may be performed under aerobic or anaerobic conditions where anaerobic or
microaerobic conditions are preferred.
Batch and Continuous Fermentations :

The present process uses a batch method of fermentation. A classical batch fermentation 
is a closed system where the composition of the media is set at the beginning of the fermentation 
and not subject to artificial alterations during the fermentation. Thus, at the beginning of the 
fermentation the media is inoculated with the desired organism or organisms and fermentation is 
permitted to occur without adding anything to the system. Typically, however, a batch fermentation is
"batch" with respect to the addition of the carbon source and attempts are often made at
controlling factors such as pH and oxygen concentration. The metabolite and biomass
compositions of the batch system change constantly up to the time the fermentation is stopped.
Within batch cultures cells progress through a static lag phase to a high growth log phase and
finally to a stationary phase where growth rate is diminished or halted. If untreated, cells in the 
stationary phase will eventually die. Cells in log phase generally are responsible for the bulk of 
production of end product or intermediate. 

A variation on the standard batch system is the Fed-Batch fermentation system which is 
also suitable in the present invention. In this variation of a typical batch system, the substrate is 
added in increments as the fermentation progresses. Fed-Batch systems are useful when 
catabolite repression is apt to inhibit the metabolism of the cells and where it is desirable to have 
limited amounts of substrate in the media. Measurement of the actual substrate concentration in 
Fed-Batch systems is difficult and is therefore estimated on the basis of the changes of 
measurable factors such as pH, dissolved oxygen and the partial pressure of waste gases such 
as CO2. Batch and Fed-Batch fermentations are common and well known in the art and
examples may be found in Brock, supra. 

It is also contemplated that the method would be adaptable to continuous fermentation 
methods. Continuous fermentation is an open system where a defined fermentation media is 
added continuously to a bioreactor and an equal amount of conditioned media is removed 
simultaneously for processing. Continuous fermentation generally maintains the cultures at a 
constant high density where cells are primarily in log phase growth. 

Continuous fermentation allows for the modulation of one factor or any number of factors 
that affect cell growth or end product concentration. For example, one method will maintain a 
limiting nutrient such as the carbon source or nitrogen level at a fixed rate and allow all other 
parameters to moderate. In other systems a number of factors affecting growth can be altered 
continuously while the cell concentration, measured by media turbidity, is kept constant. 
Continuous systems strive to maintain steady state growth conditions and thus the cell loss due to 
media being drawn off must be balanced against the cell growth rate in the fermentation. 
Methods of modulating nutrients and growth factors for continuous fermentation processes as
well as techniques for maximizing the rate of product formation are well known in the art of
industrial microbiology and a variety of methods are detailed by Brock, supra.
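
The steady-state condition described above, in which cell loss through withdrawn media is balanced
against cell growth, can be sketched with the standard biomass balance dX/dt = (mu - D)X, where mu
is the specific growth rate and D is the dilution rate. The Python sketch below assumes, purely for
illustration, that mu is constant; in a real continuous culture mu depends on the limiting nutrient.
The function name and the numerical values are hypothetical.

```python
import math


def biomass(x0: float, mu: float, dilution_rate: float, hours: float) -> float:
    """Biomass after `hours`, from dX/dt = (mu - D) * X with constant mu and D (per hour)."""
    return x0 * math.exp((mu - dilution_rate) * hours)


# Growth exactly balances removal: biomass stays at its starting value.
steady = biomass(2.0, mu=0.3, dilution_rate=0.3, hours=10.0)
# Removal exceeds growth: the culture washes out (roughly 0.74 after 10 h here).
washout = biomass(2.0, mu=0.3, dilution_rate=0.4, hours=10.0)
print(steady, washout)
```

In this simplified picture a steady state exists only when the dilution rate equals the specific
growth rate, which is the balance the text refers to.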

It is contemplated that the present invention may be practiced using batch, fed-batch or continuous
processes and that any known mode of fermentation would be suitable. Additionally, it is
contemplated that cells may be immobilized on a substrate as whole cell catalysts and subjected
to fermentation conditions for 1,3-propanediol production.
Alterations in the 1,3-propanediol production pathway :

Representative enzyme pathway. The production of 1,3-propanediol from glucose can be
accomplished by the following series of steps. This series is representative of a number of
pathways known to those skilled in the art. Glucose is converted in a series of steps by enzymes
of the glycolytic pathway to dihydroxyacetone phosphate (DHAP) and 3-phosphoglyceraldehyde
(3-PG). Glycerol is then formed by either hydrolysis of DHAP to dihydroxyacetone (DHA) followed
by reduction, or reduction of DHAP to glycerol 3-phosphate (G3P) followed by hydrolysis. The
hydrolysis step can be catalyzed by any number of cellular phosphatases which are known to be
specific or non-specific with respect to their substrates or the activity can be introduced into the
host by recombination. The reduction step can be catalyzed by a NAD+ (or NADP+) linked host
enzyme or the activity can be introduced into the host by recombination. It is notable that the dha
regulon contains a glycerol dehydrogenase (E.C. 1.1.1.6) which catalyzes the reversible reaction
of Equation 3.

Glycerol → 3-HP + H2O                                   (Equation 1)
3-HP + NADH + H+ → 1,3-Propanediol + NAD+               (Equation 2)
Glycerol + NAD+ → DHA + NADH + H+                       (Equation 3)

Glycerol is converted to 1,3-propanediol via the intermediate 3-hydroxypropionaldehyde (3-HP) as
has been described in detail above. The intermediate 3-HP is produced from glycerol
(Equation 1) by a dehydratase enzyme which can be encoded by the host or can be introduced into
the host by recombination. This dehydratase can be glycerol dehydratase (E.C. 4.2.1.30), diol
dehydratase (E.C. 4.2.1.28), or any other enzyme able to catalyze this transformation. Glycerol
dehydratase, but not diol dehydratase, is encoded by the dha regulon. 1,3-Propanediol is
produced from 3-HP (Equation 2) by a NAD+ (or NADP+) linked host enzyme or the activity can
be introduced into the host by recombination. This final reaction in the production of 1,3-propanediol
can be catalyzed by 1,3-propanediol dehydrogenase (E.C. 1.1.1.202) or other alcohol
dehydrogenases.

Mutations and transformations that affect carbon channeling. A variety of mutant organisms
comprising variations in the 1,3-propanediol production pathway will be useful in the present
invention. The introduction of a triosephosphate isomerase mutation (tpi-) into the microorganism
is an example of the use of a mutation to improve the performance by carbon channeling.
Alternatively, mutations which diminish the production of ethanol (adh) or lactate (ldh) will
increase the availability of NADH for the production of 1,3-propanediol. Additional mutations in
steps of glycolysis after glyceraldehyde-3-phosphate such as phosphoglycerate mutase (pgm)
would be useful to increase the flow of carbon to the 1,3-propanediol production pathway.
Mutations that affect glucose transport such as PTS which would prevent loss of PEP may also
prove useful. Mutations which block alternate pathways for intermediates of the 1,3-propanediol
production pathway such as the glycerol catabolic pathway (glp) would also be useful to the
present invention. The mutation can be directed toward a structural gene so as to impair or
improve an enzymatic activity or can be directed toward a regulatory gene so as to
modulate the expression level of an enzymatic activity.
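
The NADH requirement discussed in this section can be made explicit by summing Equations 1 to 3
as stoichiometric bookkeeping. The Python sketch below is illustrative only; the dictionary
representation and the function name are assumptions, and the sums describe net stoichiometry,
not reaction rates or the direction in which the reversible Equation 3 actually runs in a given host.

```python
# Stoichiometric coefficients: negative = consumed, positive = produced.
EQUATION_1 = {"glycerol": -1, "3-HP": 1, "H2O": 1}
EQUATION_2 = {"3-HP": -1, "NADH": -1, "H+": -1, "1,3-propanediol": 1, "NAD+": 1}
EQUATION_3 = {"glycerol": -1, "NAD+": -1, "DHA": 1, "NADH": 1, "H+": 1}


def combine(*reactions: dict[str, int]) -> dict[str, int]:
    """Sum reactions and drop species whose net coefficient is zero."""
    total: dict[str, int] = {}
    for reaction in reactions:
        for species, coeff in reaction.items():
            total[species] = total.get(species, 0) + coeff
    return {species: coeff for species, coeff in total.items() if coeff != 0}


# Equations 1 + 2: glycerol + NADH + H+ -> 1,3-propanediol + NAD+ + H2O
net_reductive = combine(EQUATION_1, EQUATION_2)
# Equations 1 + 2 + 3: 2 glycerol -> 1,3-propanediol + DHA + H2O (cofactors cancel)
redox_neutral = combine(EQUATION_1, EQUATION_2, EQUATION_3)
print(net_reductive)
print(redox_neutral)
```

The first sum shows that one NADH is consumed for each glycerol converted to 1,3-propanediol,
which is why mutations that spare NADH are of interest. The second shows that Equation 3, taken
in the forward direction, would supply exactly the NADH that Equation 2 consumes.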

Alternatively, transformations and mutations can be combined so as to control particular 
enzyme activities for the enhancement of 1,3-propanediol production. Thus it is within the scope 
of the present invention to anticipate modifications of a whole cell catalyst which lead to an 
increased production of 1,3-propanediol. 
Identification and purification of 1,3-propanediol : 

Methods for the purification of 1,3-propanediol from fermentation media are known in the 
art. For example, propanediols can be obtained from cell media by subjecting the reaction 
mixture to extraction with an organic solvent, distillation and column chromatography 
(U.S. 5,356,812). A particularly good organic solvent for this process is cyclohexane 
(U.S. 5,008,473). 

1,3-Propanediol may be identified directly by submitting the media to high pressure liquid 
chromatography (HPLC) analysis. Preferred in the present invention is a method where 
fermentation media is analyzed on an analytical ion exchange column using a mobile phase of 
0.01 N sulfuric acid in an isocratic fashion. 
Identification and purification of G3PDH and G3P phosphatase : 

The levels of expression of the proteins G3PDH and G3P phosphatase are measured by
enzyme assays. The G3PDH activity assay relies on the spectral properties of the cosubstrate, NADH,
in the DHAP conversion to G-3-P. NADH has intrinsic UV/vis absorption and its consumption can
be monitored spectrophotometrically at 340 nm. G3P phosphatase activity can be measured by
any method of measuring the inorganic phosphate liberated in the reaction. The most commonly
used detection method uses the visible spectroscopic determination of a blue-colored
ammonium phosphomolybdate complex.
EXAMPLES



GENERAL METHODS 

Procedures for phosphorylations, ligations and transformations are well known in the art. 
Techniques suitable for use in the following examples may be found in Sambrook, J. et al.,
Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY (1989). 

Materials and methods suitable for the maintenance and growth of bacterial cultures are 
well known in the art. Techniques suitable for use in the following examples may be found as set 
out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N.
Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds.,
American Society for Microbiology, Washington, DC (1994)) or by Thomas D. Brock in
Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc.,
Sunderland, MA (1989). All reagents and materials used for the growth and maintenance of
bacterial cells were obtained from Aldrich Chemicals (Milwaukee, WI), DIFCO Laboratories
(Detroit, MI), GIBCO/BRL (Gaithersburg, MD), or Sigma Chemical Company (St. Louis, MO)
unless otherwise specified. 

The meaning of abbreviations is as follows: "h" means hour(s), "min" means minute(s),
"sec" means second(s), "d" means day(s), "mL" means milliliters, "L" means liters.
ENZYME ASSAYS 

Glycerol dehydratase activity in cell-free extracts was determined using 1,2-propanediol
as substrate. The assay, based on the reaction of aldehydes with methylbenzo-2-thiazolone
hydrazone, has been described by Forage and Foster (Biochim. Biophys. Acta, 569, 249 (1979)).
The activity of 1,3-propanediol oxidoreductase, sometimes referred to as 1,3-propanediol
dehydrogenase, was determined in solution or in slab gels using 1,3-propanediol and NAD+ as
substrates, as has also been described (Johnson and Lin, J. Bacteriol., 169, 2050 (1987)). NADH
or NADPH dependent glycerol 3-phosphate dehydrogenase (G3PDH) activity was determined
spectrophotometrically, following the disappearance of NADH or NADPH, as has been described
(R. M. Bell and J. E. Cronan, Jr., J. Biol. Chem. 250:7153-8 (1975)).

Honda et al. (1980, In Situ Reactivation of Glycerol-Inactivated Coenzyme B12-Dependent
Enzymes, Glycerol Dehydratase and Diol Dehydratase. Journal of Bacteriology 143:1458-1465)
disclose an assay that measures the reactivation of dehydratases.
Assay for glycerol-3-phosphatase, GPP 

The assay for enzyme activity was performed by incubating the extract with an organic
phosphate substrate in a bis-Tris or MES and magnesium buffer, pH 6.5. The substrate used
was L-α-glycerol phosphate or D,L-α-glycerol phosphate. The final concentrations of the reagents in
the assay are: buffer (20 mM bis-Tris or 50 mM MES); MgCl2 (10 mM); and substrate (20 mM).
If the total protein in the sample was low and no visible precipitation occurred with an acid quench,
the sample was conveniently assayed in the cuvette. This method involved incubating an enzyme
sample in a cuvette that contained 20 mM substrate (50 uL, 200 mM) and 50 mM MES, 10 mM
MgCl2, pH 6.5 buffer. The final phosphatase assay volume was 0.5 mL. The enzyme-containing
sample was added to the reaction mixture; the contents of the cuvette were mixed and then the
cuvette was placed in a circulating water bath at T = 37 °C for 5 to 120 min, depending on
whether the phosphatase activity in the enzyme sample ranged from 2 to 0.02 U/mL. The
enzymatic reaction was quenched by the addition of the acid molybdate reagent (0.4 mL). After
the Fiske SubbaRow reagent (0.1 mL) and distilled water (1.5 mL) were added, the solution was
mixed and allowed to develop. After 10 min, the absorbance of the samples was read at 660 nm
using a Cary 219 UV/Vis spectrophotometer. The amount of inorganic phosphate released was
compared to a standard curve that was prepared by using a stock inorganic phosphate solution
(0.65 mM) and preparing 6 standards with final inorganic phosphate concentrations ranging from
0.026 to 0.130 umol/mL.
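
The standard-curve comparison described above amounts to a linear fit of absorbance against phosphate concentration, followed by a back-calculation for each sample. The sketch below illustrates that arithmetic in Python under several assumptions that are not stated in the text: the curve is treated as linear, the standard concentrations are read as umol/mL in the 2.5 mL developed solution (0.5 mL assay plus 0.4 mL, 0.1 mL and 1.5 mL of added reagents), and one unit is taken as 1 umol of phosphate released per minute. All function names are hypothetical.

```python
# Volume in which the color is developed: assay + acid molybdate reagent
# + Fiske SubbaRow reagent + distilled water (volumes from the text).
DEVELOPED_VOLUME_ML = 0.5 + 0.4 + 0.1 + 1.5


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def phosphate_from_absorbance(a660, slope, intercept):
    """Invert the standard curve: absorbance at 660 nm -> umol/mL phosphate."""
    return (a660 - intercept) / slope


def phosphatase_units_per_ml(phosphate_umol_per_ml, minutes, sample_volume_ml):
    """Activity in U per mL of enzyme sample (1 U = 1 umol phosphate/min)."""
    released_umol = phosphate_umol_per_ml * DEVELOPED_VOLUME_ML
    return released_umol / minutes / sample_volume_ml
```

In practice the absorbances passed to `fit_line` would be the blank-corrected readings of the six standards, and samples reading outside the range of the standards would be diluted or incubated for a different time rather than extrapolated.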
Isolation and identification of 1,3-propanediol

The conversion of glycerol to 1,3-propanediol was monitored by HPLC. Analyses were
performed using standard techniques and materials available to one skilled in the art of
chromatography. One suitable method utilized a Waters Maxima 820 HPLC system using UV
(210 nm) and RI detection. Samples were injected onto a Shodex SH-1011 column (8 mm x
300 mm, purchased from Waters, Milford, MA) equipped with a Shodex SH-1011P precolumn
(6 mm x 50 mm), temperature controlled at 50 °C, using 0.01 N H2SO4 as mobile phase at a flow
rate of 0.5 mL/min. When quantitative analysis was desired, samples were prepared with a
known amount of trimethylacetic acid as external standard. Typically, the retention times of
glycerol (RI detection), 1,3-propanediol (RI detection), and trimethylacetic acid (UV and RI
detection) were 20.67 min, 26.08 min, and 35.03 min, respectively.
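
Peak assignment and quantitation against the added standard can be illustrated with a short Python sketch. The retention times are those given above; the 0.3 min matching window, the response factor, and both function names are hypothetical choices made for illustration and are not part of the described method.

```python
# Typical retention times (min) reported for the Shodex SH-1011 method.
REFERENCE_RT_MIN = {
    "glycerol": 20.67,
    "1,3-propanediol": 26.08,
    "trimethylacetic acid": 35.03,
}


def assign_peak(rt_min, tolerance_min=0.3):
    """Return the compound whose reference time is closest to rt_min,
    or None if no reference lies within the (assumed) tolerance window."""
    name, ref = min(REFERENCE_RT_MIN.items(),
                    key=lambda item: abs(item[1] - rt_min))
    return name if abs(ref - rt_min) <= tolerance_min else None


def quantify(analyte_area, standard_area, standard_conc, response_factor=1.0):
    """Estimate analyte concentration from peak areas relative to the
    trimethylacetic acid standard. response_factor is the analyte/standard
    detector response ratio and would be determined by calibration."""
    if standard_area <= 0 or response_factor <= 0:
        raise ValueError("standard area and response factor must be positive")
    return (analyte_area / standard_area) * standard_conc / response_factor
```

Retention times drift with column age and temperature, so the window would normally be set from replicate injections of authentic standards.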

Production of 1,3-propanediol was confirmed by GC/MS. Analyses were performed using
standard techniques and materials available to one of skill in the art of GC/MS. One suitable
method utilized a Hewlett Packard 5890 Series II gas chromatograph coupled to a Hewlett
Packard 5971 Series mass selective detector (EI) and a HP-INNOWax column (30 m length,
0.25 mm i.d., 0.25 micron film thickness). The retention time and mass spectrum of
1,3-propanediol generated were compared to that of authentic 1,3-propanediol (m/e: 57, 58).

An alternative method for GC/MS involved derivatization of the sample. To 1.0 mL of
sample (e.g., culture supernatant) was added 30 uL of concentrated (70% v/v) perchloric acid.
After mixing, the sample was frozen and lyophilized. A 1:1 mixture of
bis(trimethylsilyl)trifluoroacetamide:pyridine (300 uL) was added to the lyophilized material, mixed
vigorously and placed at 65 °C for one h. The sample was clarified of insoluble material by
centrifugation. The resulting liquid partitioned into two phases, the upper of which was used for
analysis. The sample was chromatographed on a DB-5 column (48 m, 0.25 mm I.D., 0.25 um
film thickness; from J&W Scientific) and the retention time and mass spectrum of the
1,3-propanediol derivative obtained from culture supernatants were compared to that obtained
from authentic standards. The mass spectrum of TMS-derivatized 1,3-propanediol contains the
characteristic ions of 205, 177, 130 and 115 AMU.

EXAMPLE 1 

CLONING AND TRANSFORMATION OF E. COLI HOST CELLS WITH COSMID DNA FOR THE
EXPRESSION OF 1,3-PROPANEDIOL

Media 

Synthetic S12 medium was used in the screening of bacterial transformants for the ability
to make 1,3-propanediol. S12 medium contains: 10 mM ammonium sulfate, 50 mM potassium
phosphate buffer, pH 7.0, 2 mM MgCl2, 0.7 mM CaCl2, 50 uM MnCl2, 1 uM FeCl3, 1 uM ZnCl,
1.7 uM CuSO4, 2.5 uM CoCl2, 2.4 uM Na2MoO4, and 2 uM thiamine hydrochloride.

Medium A used for growth and fermentation consisted of: 10 mM ammonium sulfate;
50 mM MOPS/KOH buffer, pH 7.5; 5 mM potassium phosphate buffer, pH 7.5; 2 mM MgCl2;
0.7 mM CaCl2; 50 uM MnCl2; 1 uM FeCl3; 1 uM ZnCl; 1.72 uM CuSO4; 2.53 uM CoCl2; 2.42 uM
Na2MoO4; 2 uM thiamine hydrochloride; 0.01% yeast extract; 0.01% casamino acids; 0.8 ug/mL
vitamin B12; and 50 ug/mL amp. Medium A was supplemented with either 0.2% glycerol or 0.2%
glycerol plus 0.2% D-glucose as required.
Cells:

Klebsiella pneumoniae ECL2106 (Ruch et al., J. Bacteriol., 124, 348 (1975)), also known
in the literature as K. aerogenes or Aerobacter aerogenes, was obtained from E. C. C. Lin 
(Harvard Medical School, Cambridge, MA) and was maintained as a laboratory culture. 

Klebsiella pneumoniae ATCC 25955 was purchased from American Type Culture 
Collection (Rockville, MD). 

E. coli DH5a was purchased from Gibco/BRL and was transformed with the cosmid DNA 
isolated from Klebsiella pneumoniae ATCC 25955 containing a gene coding for either a glycerol 
or diol dehydratase enzyme. Cosmids containing the glycerol dehydratase were identified as
pKP1 and pKP2, and the cosmid containing the diol dehydratase enzyme was identified as pKP4.
Transformed DH5a cells were identified as DH5a-pKP1, DH5a-pKP2, and DH5a-pKP4. 

E. coli ECL707 (Sprenger et al., J. Gen. Microbiol., 135, 1255 (1989)) was obtained from 
E. C. C. Lin (Harvard Medical School, Cambridge, MA) and was similarly transformed with 
cosmid DNA from Klebsiella pneumoniae. These transformants were identified as ECL707-pKP1 
and ECL707-pKP2, containing the glycerol dehydratase gene and ECL707-pKP4 containing the 
diol dehydratase gene. 
E. coli AA200 containing a mutation in the tpi gene (Anderson et al., J. Gen. Microbiol., 62,
329 (1970)) was purchased from the E. coli Genetic Stock Center, Yale University (New Haven,
CT) and was transformed with Klebsiella cosmid DNA to give the recombinant organisms
AA200-pKP1 and AA200-pKP2, containing the glycerol dehydratase gene, and AA200-pKP4,
containing the diol dehydratase gene.
DH5a:

Six transformation plates containing approximately 1,000 colonies of E. coli XL1-Blue MR
transfected with K. pneumoniae DNA were washed with 5 mL LB medium and centrifuged. The
bacteria were pelleted and resuspended in 5 mL LB medium + glycerol. An aliquot (50 uL) was
inoculated into a 15 mL tube containing S12 synthetic medium with 0.2% glycerol + 400 ng per
mL of vitamin B12 + 0.001% yeast extract + 50amp. The tube was filled with the medium to the
top, wrapped with parafilm and incubated at 30 °C. A slight turbidity was observed after 48 h.
Aliquots, analyzed for product distribution as described above at 78 h and 132 h, were positive for
1,3-propanediol, the later time points containing increased amounts of 1,3-propanediol.

The bacteria, testing positive for 1,3-propanediol production, were serially diluted and
plated onto LB-50amp plates in order to isolate single colonies. Forty-eight single colonies were
isolated and checked again for the production of 1,3-propanediol. Cosmid DNA was isolated from
6 independent clones and transformed into E. coli strain DH5a. The transformants were again
checked for the production of 1,3-propanediol. Two transformants were characterized further and
designated as DH5a-pKP1 and DH5a-pKP2.

A 12.1 kb EcoRI-SalI fragment from pKP1, subcloned into pIBI31 (IBI Biosystem, New
Haven, CT), was sequenced and termed pHK28-26 (SEQ ID NO:19). Sequencing revealed the
loci of the relevant open reading frames of the dha operon encoding glycerol dehydratase and
genes necessary for regulation. Referring to SEQ ID NO:19, a fragment of the open reading
frame for dhaK encoding dihydroxyacetone kinase is found at bases 1-399; the open reading
frame dhaD encoding glycerol dehydrogenase is found at bases 983-2107; the open reading
frame dhaR encoding the repressor is found at bases 2209-4134; the open reading frame dhaT
encoding 1,3-propanediol oxidoreductase is found at bases 5017-6180; the open reading frame
dhaB1 encoding the alpha subunit glycerol dehydratase is found at bases 7044-8711; the open
reading frame dhaB2 encoding the beta subunit glycerol dehydratase is found at bases
8724-9308; the open reading frame dhaB3 encoding the gamma subunit glycerol dehydratase is
found at bases 9311-9736; and the open reading frame dhaBX, encoding a protein of unknown
function, is found at bases 9749-11572.
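
The base coordinates listed above can be checked for internal consistency: with inclusive numbering, each open reading frame should span a whole number of codons. The Python sketch below performs that check on the coordinates as stated for SEQ ID NO:19. The dictionary and function names are illustrative only, and the codon counts include any stop codon that falls inside the stated range.

```python
# Inclusive base coordinates in SEQ ID NO:19, as listed in the text.
ORFS = {
    "dhaK (fragment)": (1, 399),
    "dhaD": (983, 2107),
    "dhaR": (2209, 4134),
    "dhaT": (5017, 6180),
    "dhaB1": (7044, 8711),
    "dhaB2": (8724, 9308),
    "dhaB3": (9311, 9736),
    "dhaBX": (9749, 11572),
}


def orf_length(start, end):
    """Length in bases of an inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start, end):
    """Number of codons in the range; raises if it is not a multiple of 3."""
    length = orf_length(start, end)
    if length % 3:
        raise ValueError(f"{length} bases is not a whole number of codons")
    return length // 3


def summarize(orfs=ORFS):
    """Map each ORF name to (length in bases, number of codons)."""
    return {name: (orf_length(*span), codon_count(*span))
            for name, span in orfs.items()}
```

Every range in the list passes the check, which is a useful guard against transcription errors in coordinates such as 7044-8711 and 9311-9736.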

Single colonies of E. coli XL1-Blue MR transfected with packaged cosmid DNA from
K. pneumoniae were inoculated into microtiter wells containing 200 uL of S15 medium
(ammonium sulfate, 10 mM; potassium phosphate buffer, pH 7.0, 1 mM; MOPS/KOH buffer,
pH 7.0, 50 mM; MgCl2, 2 mM; CaCl2, 0.7 mM; MnCl2, 50 uM; FeCl3, 1 uM; ZnCl, 1 uM; CuSO4,
1.72 uM; CoCl2, 2.53 uM; Na2MoO4, 2.42 uM; and thiamine hydrochloride, 2 uM) + 0.2% glycerol
+ 400 ng/mL of vitamin B12 + 0.001% yeast extract + 50 ug/mL ampicillin. In addition to the
microtiter wells, a master plate containing LB-50 amp was also inoculated. After 96 h, 100 uL
was withdrawn and centrifuged in a Rainin microfuge tube containing a 0.2 micron nylon
membrane filter. Bacteria were retained and the filtrate was processed for HPLC analysis.
Positive clones demonstrating 1,3-propanediol production were identified after screening
approximately 240 colonies. Three positive clones were identified, two of which had grown on
LB-50 amp and one of which had not. A single colony, isolated from one of the two positive
clones grown on LB-50 amp and verified for the production of 1,3-propanediol, was designated as
pKP4. Cosmid DNA was isolated from E. coli strains containing pKP4 and E. coli strain DH5a
was transformed. An independent transformant, designated as DH5a-pKP4, was verified for the
production of 1,3-propanediol.
ECL707:

E. coli strain ECL707 was transformed with cosmid K. pneumoniae DNA corresponding to
one of pKP1, pKP2, pKP4 or the Supercos vector alone and named ECL707-pKP1,
ECL707-pKP2, ECL707-pKP4, and ECL707-SC, respectively. ECL707 is defective in glpK, gld,
and ptsD, which encode the ATP-dependent glycerol kinase, NAD+-linked glycerol
dehydrogenase, and enzyme II for dihydroxyacetone of the phosphoenolpyruvate-dependent
phosphotransferase system, respectively.

Twenty single colonies of each cosmid transformation and five of the Supercos vector
alone (negative control) transformation, isolated from LB-50amp plates, were transferred to a
master LB-50amp plate. These isolates were also tested for their ability to convert glycerol to
1,3-propanediol in order to determine if they contained dehydratase activity. The transformants
were transferred with a sterile toothpick to microtiter plates containing 200 uL of Medium A
supplemented with either 0.2% glycerol or 0.2% glycerol plus 0.2% D-glucose. After incubation
for 48 h at 30 °C, the contents of the microtiter plate wells were filtered through a 0.45 micron
nylon filter and chromatographed by HPLC. The results of these tests are given in Table 1.



Table 1 

Conversion of glycerol to 1,3-propanediol by transformed ECL707

Transformant      Glycerol*      Glycerol plus Glucose*

ECL707-pKP1       19/20          19/20
ECL707-pKP2       18/20          20/20
ECL707-pKP4       0/20           20/20
ECL707-SC         0/5            0/5

*(Number of positive isolates/number of isolates tested)
AA200:

E. coli strain AA200 was transformed with cosmid K. pneumoniae DNA corresponding to
one of pKP1, pKP2, pKP4 and the Supercos vector alone and named AA200-pKP1, AA200-pKP2,
AA200-pKP4, and AA200-SC, respectively. Strain AA200 is defective in triosephosphate
isomerase (tpi).

Twenty single colonies of each cosmid transformation and five of the empty vector 
transformation were isolated and tested for their ability to convert glycerol to 1,3-propanediol as 
described for E. coli strain ECL707. The results of these tests are given in Table 2. 



Table 2 

Conversion of glycerol to 1,3-propanediol by transformed AA200

Transformant      Glycerol*      Glycerol plus Glucose*

AA200-pKP1        17/20          17/20
AA200-pKP2        17/20          17/20
AA200-pKP4        2/20           16/20
AA200-SC          0/5            0/5

*(Number of positive isolates/number of isolates tested)
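
The isolate counts in Tables 1 and 2 are easier to compare across hosts when expressed as percentages. The short Python sketch below does only that arithmetic on the counts transcribed from the two tables; the data structure and function names are illustrative and carry no statistical analysis.

```python
# (positive isolates, isolates tested), transcribed from Tables 1 and 2.
SCREEN = {
    "ECL707-pKP1": {"glycerol": (19, 20), "glycerol+glucose": (19, 20)},
    "ECL707-pKP2": {"glycerol": (18, 20), "glycerol+glucose": (20, 20)},
    "ECL707-pKP4": {"glycerol": (0, 20), "glycerol+glucose": (20, 20)},
    "ECL707-SC": {"glycerol": (0, 5), "glycerol+glucose": (0, 5)},
    "AA200-pKP1": {"glycerol": (17, 20), "glycerol+glucose": (17, 20)},
    "AA200-pKP2": {"glycerol": (17, 20), "glycerol+glucose": (17, 20)},
    "AA200-pKP4": {"glycerol": (2, 20), "glycerol+glucose": (16, 20)},
    "AA200-SC": {"glycerol": (0, 5), "glycerol+glucose": (0, 5)},
}


def percent_positive(positive, tested):
    """Percentage of tested isolates that produced 1,3-propanediol."""
    if tested <= 0 or not 0 <= positive <= tested:
        raise ValueError("invalid isolate counts")
    return 100.0 * positive / tested


def summarize(screen=SCREEN):
    """Return the same nested mapping with counts replaced by percentages."""
    return {strain: {cond: percent_positive(*counts)
                     for cond, counts in conditions.items()}
            for strain, conditions in screen.items()}
```

Expressed this way, the pKP4 transformants stand out in both hosts: few or none are positive on glycerol alone, while most or all are positive when glucose is also supplied.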

EXAMPLE 2 

CONVERSION OF D-GLUCOSE TO 1,3-PROPANEDIOL BY RECOMBINANT E. coli USING
DAR1, GPP2, dhaB, and dhaT

Construction of general purpose expression plasmids for use in transformation of Escherichia coli

The expression vector pTacIQ

The E. coli expression vector, pTacIQ, contains the lacIq gene (Farabaugh, Nature 274,
5673 (1978)) and tac promoter (Amann et al., Gene 25, 167 (1983)) inserted into the EcoRI site of
pBR322 (Sutcliffe et al., Cold Spring Harb. Symp. Quant. Biol. 43, 77 (1979)). A multiple cloning
site and terminator sequence (SEQ ID NO:20) replaces the pBR322 sequence from EcoRI to
SphI.

Subcloning the glycerol dehydratase genes (dhaB1, 2, 3)

The open reading frame for the dhaB3 gene (incorporating an EcoRI site at the 5' end and an
XbaI site at the 3' end) was amplified from pHK28-26 by PCR using primers (SEQ ID NOS:21 and
22). The product was subcloned into pLitmus29 (New England Biolab, Inc., Beverly, MA) to
generate the plasmid pDHAB3 containing dhaB3.
The region containing the entire coding region for the four genes of the dhaB operon from
pHK28-26 was cloned into pBluescriptII KS+ (Stratagene, La Jolla, CA) using the restriction
enzymes KpnI and EcoRI to create the plasmid pM7.

The dhaBX gene was removed by digesting the plasmid pM7, which contains
dhaB(1,2,3,4), with ApaI and XbaI (deleting part of dhaB3 and all of dhaBX). The resulting 5.9 kb
fragment was purified and ligated with the 325-bp ApaI-XbaI fragment from plasmid pDHAB3
(restoring the dhaB3 gene) to create pM11, which contains dhaB(1,2,3).

The open reading frame for the dhaB1 gene (incorporating a HindIII site and a consensus
ribosome binding site (RBS) at the 5' end and an XbaI site at the 3' end) was amplified from
pHK28-26 by PCR using primers (SEQ ID NO:23 and SEQ ID NO:24). The product was
subcloned into pLitmus28 (New England Biolab, Inc.) to generate the plasmid pDT1 containing
dhaB1.

A NotI-XbaI fragment from pM11 containing part of the dhaB1 gene, the dhaB2 gene and
the dhaB3 gene was inserted into pDT1 to create the dhaB expression plasmid, pDT2. The
HindIII-XbaI fragment containing the dhaB(1,2,3) genes from pDT2 was inserted into pTacIQ to
create pDT3.

Subcloning the 1,3-propanediol dehydrogenase gene (dhaT)

The KpnI-SacI fragment of pHK28-26, containing the complete 1,3-propanediol
dehydrogenase (dhaT) gene, was subcloned into pBluescriptII KS+ creating plasmid pAH1. The
dhaT gene (incorporating an XbaI site at the 5' end and a BamHI site at the 3' end) was amplified
by PCR from pAH1 as template DNA using synthetic primers (SEQ ID NO:25 with SEQ ID
NO:26). The product was subcloned into pCR-Script (Stratagene) at the SrfI site to generate the
plasmids pAH4 and pAH5 containing dhaT. The plasmid pAH4 contains the dhaT gene in the
correct orientation for expression from the lac promoter in pCR-Script and pAH5 contains the
dhaT gene in the opposite orientation. The XbaI-BamHI fragment from pAH4 containing the dhaT
gene was inserted into pTacIQ to generate plasmid pAH8. The HindIII-BamHI fragment from
pAH8 containing the RBS and dhaT gene was inserted into pBluescriptII KS+ to create pAH11.
The HindIII-SalI fragment from pAH8 containing the RBS, dhaT gene and terminator was inserted
into pBluescriptII SK+ to create pAH12.

Construction of an expression cassette for dhaB(1,2,3) and dhaT

An expression cassette for dhaB(1,2,3) and dhaT was assembled from the individual
dhaB(1,2,3) and dhaT subclones described above using standard molecular biology methods.
The SpeI-KpnI fragment from pAH8 containing the RBS, dhaT gene and terminator was inserted
into the XbaI-KpnI sites of pDT3 to create pAH23. The SmaI-EcoRI fragment between the dhaB3
and dhaT gene of pAH23 was removed to create pAH26. The SpeI-NotI fragment containing an
EcoRI site from pDT2 was used to replace the SpeI-NotI fragment of pAH26 to generate pAH27.
Construction of expression cassette for dhaT and dhaB(1,2,3)

An expression cassette for dhaT and dhaB(1,2,3) was assembled from the individual
dhaB(1,2,3) and dhaT subclones described previously using standard molecular biology methods.
A SpeI-SacI fragment containing the dhaB(1,2,3) genes from pDT3 was inserted into pAH11 at
the SpeI-SacI sites to create pAH24.

Cloning and expression of glycerol 3-phosphatase for increased glycerol production in E. coli 

The Saccharomyces cerevisiae chromosome V lambda clone 6592 (GenBank, accession
# U18813x11) was obtained from ATCC. The glycerol 3-phosphate phosphatase (GPP2) gene
(incorporating a BamHI-RBS-XbaI site at the 5' end and a SmaI site at the 3' end) was cloned by
PCR cloning from the lambda clone as target DNA using synthetic primers (SEQ ID NO:27 with
SEQ ID NO:28). The product was subcloned into pCR-Script (Stratagene) at the SrfI site to
generate the plasmid pAH15 containing GPP2. The plasmid pAH15 contains the GPP2 gene in
the inactive orientation for expression from the lac promoter in pCR-Script SK+. The
BamHI-SmaI fragment from pAH15 containing the GPP2 gene was inserted into pBluescriptII
SK+ to generate plasmid pAH19. The pAH19 contains the GPP2 gene in the correct orientation
for expression from the lac promoter. The XbaI-PstI fragment from pAH19 containing the GPP2
gene was inserted into pPHOX2 to create plasmid pAH21.
Plasmids for the expression of dhaT, dhaB(1,2,3) and GPP2 genes

A SalI-EcoRI-XbaI linker (SEQ ID NOS:29 and 30) was inserted into pAH5, which had been
digested with the restriction enzymes SalI and XbaI, to create pDT16. The linker destroys the XbaI
site. The 1 kb SalI-MluI fragment from pDT16 was then inserted into pAH24, replacing the
existing SalI-MluI fragment, to create pDT18.

The 4.1 kb EcoRI-XbaI fragment containing the expression cassette for dhaT and
dhaB(1,2,3) from pDT18 and the 1.0 kb XbaI-SalI fragment containing the GPP2 gene from
pAH21 were inserted into the vector pMMB66EH (Fuste et al., Gene, 48, 119 (1986)) digested
with the restriction enzymes EcoRI and SalI to create pDT20.
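
Because pDT20 is assembled through many intermediate constructs, it can help to record the subcloning steps as a parent map and trace the lineage programmatically. The Python sketch below encodes only the steps described in this Example that lead to pDT20, with each construct mapped to the plasmids or clones that contributed a fragment or a backbone. Linkers and PCR primers are omitted, and the names `PARENTS` and `lineage` are hypothetical.

```python
# Each construct maps to the DNA sources named in the text for that step.
PARENTS = {
    "pDHAB3": ["pHK28-26", "pLitmus29"],
    "pM7": ["pHK28-26", "pBluescriptII KS+"],
    "pM11": ["pM7", "pDHAB3"],
    "pDT1": ["pHK28-26", "pLitmus28"],
    "pDT2": ["pM11", "pDT1"],
    "pDT3": ["pDT2", "pTacIQ"],
    "pAH1": ["pHK28-26", "pBluescriptII KS+"],
    "pAH4": ["pAH1", "pCR-Script"],
    "pAH5": ["pAH1", "pCR-Script"],
    "pAH8": ["pAH4", "pTacIQ"],
    "pAH11": ["pAH8", "pBluescriptII KS+"],
    "pAH24": ["pDT3", "pAH11"],
    "pDT16": ["pAH5"],
    "pDT18": ["pDT16", "pAH24"],
    "pAH15": ["lambda clone 6592", "pCR-Script"],
    "pAH19": ["pAH15", "pBluescriptII SK+"],
    "pAH21": ["pAH19", "pPHOX2"],
    "pDT20": ["pDT18", "pAH21", "pMMB66EH"],
}


def lineage(construct, parents=PARENTS):
    """Return all ancestors of a construct in depth-first order, no repeats."""
    seen = []

    def visit(name):
        for parent in parents.get(name, ()):
            if parent not in seen:
                seen.append(parent)
                visit(parent)

    visit(construct)
    return seen


def starting_materials(construct, parents=PARENTS):
    """Ancestors that are not themselves built from anything in the map."""
    return sorted(a for a in lineage(construct, parents) if a not in parents)
```

Calling `starting_materials("pDT20")` lists the vectors and source clones that would have to be on hand before the first step, which is a convenient check when planning or auditing the construction.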
Plasmids for the over-expression of DAR1 in E. coli

DAR1 was isolated by PCR cloning from genomic S. cerevisiae DNA using synthetic
primers (SEQ ID NO:46 with SEQ ID NO:47). Successful PCR cloning places an NcoI site at the
5' end of DAR1 where the ATG within NcoI is the DAR1 initiator methionine. At the 3' end of
DAR1 a BamHI site is introduced following the translation terminator. The PCR fragments were
digested with NcoI + BamHI and cloned into the same sites within the expression plasmid
pTrc99A (Pharmacia, Piscataway, New Jersey) to give pDAR1A.

In order to create a better ribosome binding site at the 5' end of DAR1, a SpeI-RBS-NcoI
linker obtained by annealing synthetic primers (SEQ ID NO:48 with SEQ ID NO:49) was inserted
into the NcoI site of pDAR1A to create pAH40. Plasmid pAH40 contains the new RBS and DAR1
gene in the correct orientation for expression from the trc promoter of pTrc99A (Pharmacia). The
NcoI-BamHI fragment from pDAR1A and a second set of SpeI-RBS-NcoI linker obtained by
annealing synthetic primers (SEQ ID NO:31 with SEQ ID NO:32) were inserted into the
SpeI-BamHI site of pBluescript II-SK+ (Stratagene) to create pAH41. The construct pAH41
contains an ampicillin resistance gene. The NcoI-BamHI fragment from pDAR1A and a second
set of SpeI-RBS-NcoI linker obtained by annealing synthetic primers (SEQ ID NO:31 with SEQ ID
NO:32) were inserted into the SpeI-BamHI site of pBC-SK+ (Stratagene) to create pAH42. The
construct pAH42 contains a chloramphenicol resistance gene.
Construction of an expression cassette for DAR1 and GPP2 

An expression cassette for DAR1 and GPP2 was assembled from the individual DAR1
and GPP2 subclones described above using standard molecular biology methods. The 
BamHI-Pstl fragment from pAH19 containing the RBS and GPP2 gene was inserted into pAH40 
to create pAH43. The BamHI-Pstl fragment from pAH19 containing the RBS and GPP2 gene was 
inserted into pAH41 to create pAH44. The same BamHI-Pstl fragment from pAH19 containing 
the RBS and GPP2 gene was also inserted into pAH42 to create pAH45. 

The ribosome binding site at the 5' end of GPP2 was modified as follows. A BamHI-RBS-SpeI
linker, obtained by annealing synthetic primers
GATCCAGGAAACAGA with CTAGTCTGTTTCCTG to the XbaI-PstI fragment from pAH19
containing the GPP2 gene, was inserted into the BamHI-PstI site of pAH40 to create pAH48.
Plasmid pAH48 contains the DAR1 gene, the modified RBS, and the GPP2 gene in the correct
orientation for expression from the trc promoter of pTrc99A (Pharmacia, Piscataway, N.J.).
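
The two linker oligonucleotides given above can be checked for the expected annealing geometry: a central duplex flanked by two 4-base 5' overhangs. The Python sketch below assumes 4-base overhangs on both strands and uses hypothetical function names. It relies on the well-known facts that BamHI ends carry a 5'-GATC overhang and that SpeI and XbaI ends carry a 5'-CTAG overhang.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealed_linker(top, bottom, overhang=4):
    """Return (top 5' overhang, duplex as read on the top strand,
    bottom 5' overhang) for two oligos that anneal with 5' overhangs.

    Raises ValueError if the regions after the overhangs are not
    perfectly complementary.
    """
    core_top = top[overhang:]
    core_bottom = bottom[overhang:]
    if reverse_complement(core_bottom) != core_top:
        raise ValueError("oligos do not anneal as a perfect duplex")
    return top[:overhang], core_top, bottom[:overhang]


# Oligonucleotides as written in the text, 5' to 3'.
TOP = "GATCCAGGAAACAGA"
BOTTOM = "CTAGTCTGTTTCCTG"
```

For the two oligonucleotides in the text, `annealed_linker(TOP, BOTTOM)` reports a GATC overhang at one end, an 11 bp duplex containing the AGGA motif of the ribosome binding site, and a CTAG overhang at the other end. This is consistent with joining a BamHI-cut vector end to the XbaI end of the GPP2 fragment.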
E. coli strain construction 

E. coli W1485 is a wild-type K-12 strain (ATCC 12435). This strain was transformed with
the plasmids pDT20 and pAH42 and selected on LA (Luria Agar, Difco) plates supplemented with
50 ug/mL carbenicillin and 10 ug/mL chloramphenicol.
Production of 1,3-propanediol from glucose

E. coli W1485/pDT20/pAH42 was transferred from a plate to 50 mL of a medium
containing per liter: 22.5 g glucose, 6.85 g K2HPO4, 6.3 g (NH4)2SO4, 0.5 g NaHCO3, 2.5 g
NaCl, 8 g yeast extract, 8 g tryptone, 2.5 mg vitamin B12, 2.5 mL modified Balch's trace-element
solution, 50 mg carbenicillin and 10 mg chloramphenicol, final pH 6.8 (HCl), then filter sterilized.
The composition of modified Balch's trace-element solution can be found in Methods for General
and Molecular Bacteriology (P. Gerhardt et al., eds, p. 158, American Society for Microbiology,
Washington, DC (1994)). After incubating at 37 °C, 300 rpm for 6 h, 0.5 g glucose and IPTG (final
concentration = 0.2 mM) were added and shaking was reduced to 100 rpm. Samples were
analyzed by GC/MS. After 24 h, W1485/pDT20/pAH42 produced 1.1 g/L glycerol and 195 mg/L
1,3-propanediol.
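
For comparison with results reported elsewhere in millimolar units, the mass concentrations above can be converted using the molar masses of glycerol (C3H8O3) and 1,3-propanediol (C3H8O2). The Python sketch below is illustrative only; it uses standard atomic masses rounded to three decimals, and its names are hypothetical.

```python
# Standard atomic masses (g/mol), rounded; sufficient for this conversion.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

GLYCEROL = {"C": 3, "H": 8, "O": 3}
PROPANEDIOL = {"C": 3, "H": 8, "O": 2}


def molar_mass(composition):
    """Molar mass in g/mol from an element -> atom count mapping."""
    return sum(ATOMIC_MASS[element] * count
               for element, count in composition.items())


def mg_per_l_to_mm(mg_per_l, composition):
    """Convert a mass concentration (mg/L) to millimolar (mmol/L)."""
    return mg_per_l / molar_mass(composition)


# Titers reported for W1485/pDT20/pAH42 after 24 h.
propanediol_mm = mg_per_l_to_mm(195.0, PROPANEDIOL)
glycerol_mm = mg_per_l_to_mm(1100.0, GLYCEROL)
```

On this basis the reported titers correspond to roughly 2.6 mM 1,3-propanediol and roughly 12 mM glycerol.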
EXAMPLE 3

CLONING AND EXPRESSION OF dhaB AND dhaT
IN Saccharomyces cerevisiae
Expression plasmids that could exist as replicating episomal elements were constructed 
for each of the four dha genes. For all expression plasmids a yeast ADH1 promoter was present 
and separated from a yeast ADH1 transcription terminator by fragments of DNA containing 
recognition sites for one or more restriction endonucleases. Each expression plasmid also 
contained the gene for β-lactamase for selection in E. coli on media containing ampicillin, an
origin of replication for plasmid maintenance in E. coli, and a 2 micron origin of replication for
maintenance in S. cerevisiae. The selectable nutritional markers used for yeast and present on
the expression plasmids were one of the following: HIS3 gene encoding
imidazoleglycerolphosphate dehydratase, URA3 gene encoding orotidine 5'-phosphate
decarboxylase, TRP1 gene encoding N-(5'-phosphoribosyl)-anthranilate isomerase, and LEU2
encoding β-isopropylmalate dehydrogenase.

The open reading frames for dhaT, dhaB3, dhaB2 and dhaB1 were amplified from
pHK28-26 (SEQ ID NO:19) by PCR using primers (SEQ ID NO:38 with SEQ ID NO:39, SEQ ID
NO:40 with SEQ ID NO:41, SEQ ID NO:42 with SEQ ID NO:43, and SEQ ID NO:44 with SEQ ID
NO:45 for dhaT, dhaB3, dhaB2 and dhaB1, respectively) incorporating EcoRI sites at the 5' ends
(10 mM Tris pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.0001% gelatin, 200 uM dATP, 200 uM dCTP,
200 uM dGTP, 200 uM dTTP, 1 uM each primer, 1-10 ng target DNA, 25 units/mL AmpliTaq
DNA polymerase (Perkin-Elmer Cetus, Norwalk, CT)). PCR parameters were 1 min at 94 °C,
1 min at 55 °C, and 1 min at 72 °C, for 35 cycles. The products were subcloned into the EcoRI site of
pHIL-D4 (Phillips Petroleum, Bartlesville, OK) to generate the plasmids pMP13, pMP14, pMP20
and pMP15 containing dhaT, dhaB3, dhaB2 and dhaB1, respectively.
Construction of dhaB1 expression plasmid pMCK10

The 7.8 kb replicating plasmid pGADGH (Clontech, Palo Alto, CA) was digested with
HindIII, dephosphorylated, and ligated to the dhaB1 HindIII fragment from pMP15. The resulting
plasmid (pMCK10) had dhaB1 correctly oriented for transcription from the ADH1 promoter and
contained a LEU2 marker.

Construction of dhaB2 expression plasmid pMCK17

Plasmid pGADGH (Clontech, Palo Alto, CA) was digested with HindIII and the single-strand
ends converted to EcoRI ends by ligation with HindIII-XmnI and EcoRI-XmnI adaptors
(New England Biolabs, Beverly, MA). Selection for plasmids with correct EcoRI ends was
achieved by ligation to a kanamycin resistance gene on an EcoRI fragment from plasmid pUC4K
(Pharmacia Biotech, Uppsala), transformation into E. coli strain DH5a and selection on LB plates
containing 25 ug/mL kanamycin. The resulting plasmid (pGAD/KAN2) was digested with SnaBI
and EcoRI and a 1.8 kb fragment with the ADH1 promoter was isolated. Plasmid pGBT9
(Clontech, Palo Alto, CA) was digested with SnaBI and EcoRI, and the 1.5 kb ADH1/GAL4
fragment replaced by the 1.8 kb ADH1 promoter fragment isolated from pGAD/KAN2 by digestion
with SnaBI and EcoRI. The resulting vector (pMCK11) is a replicating plasmid in yeast with an
ADH1 promoter and terminator and a TRP1 marker. Plasmid pMCK11 was digested with EcoRI,
dephosphorylated, and ligated to the dhaB2 EcoRI fragment from pMP20. The resulting plasmid
(pMCK17) had dhaB2 correctly oriented for transcription from the ADH1 promoter and contained
a TRP1 marker.

Construction of dhaB3 expression plasmid pMCK30 

Plasmid pGBT9 (Clontech) was digested with NaeI and PvuII and the 1 kb TRP1 gene
removed from this vector. The TRP1 gene was replaced by a URA3 gene donated as a 1.7 kb
AatII/NaeI fragment from plasmid pRS406 (Stratagene) to give the intermediary vector pMCK32.
The truncated ADH1 promoter present on pMCK32 was removed on a 1.5 kb SnaBI/EcoRI
fragment, and replaced with a full-length ADH1 promoter on a 1.8 kb SnaBI/EcoRI fragment from
plasmid pGAD/KAN2 to yield the vector pMCK26. The unique EcoRI site on pMCK26 was used
to insert an EcoRI fragment with dhaB3 from plasmid pMP14 to yield pMCK30. The pMCK30
replicating expression plasmid has dhaB3 orientated for expression from the ADH1 promoter, and
has a URA3 marker.

Construction of dhaT expression plasmid pMCK35 

Plasmid pGBT9 (Clontech) was digested with NaeI and PvuII and the 1 kb TRP1 gene
removed from this vector. The TRP1 gene was replaced by a HIS3 gene donated as an
XmnI/NaeI fragment from plasmid pRS403 (Stratagene) to give the intermediary vector pMCK33.
The truncated ADH1 promoter present on pMCK33 was removed on a 1.5 kb SnaBI/EcoRI
fragment, and replaced with a full-length ADH1 promoter on a 1.8 kb SnaBI/EcoRI fragment from
plasmid pGAD/KAN2 to yield the vector pMCK31. The unique EcoRI site on pMCK31 was used
to insert an EcoRI fragment with dhaT from plasmid pMP13 to yield pMCK35. The pMCK35
replicating expression plasmid has dhaT orientated for expression from the ADH1 promoter, and
has a HIS3 marker.

Transformation of S. cerevisiae with dha expression plasmids 

S. cerevisiae strain YPH500 (ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1)
(Sikorski R. S. and Hieter P., Genetics 122, 19-27, (1989)) purchased from Stratagene (La Jolla,
CA) was transformed with 1-2 ug of plasmid DNA using a Frozen-EZ Yeast Transformation Kit
(Catalog #T2001) (Zymo Research, Orange, CA). Colonies were grown on Supplemented
Minimal Medium (SMM - 0.67% yeast nitrogen base without amino acids, 2% glucose) for 3-4 d at
29 °C with one or more of the following additions: adenine sulfate (20 mg/L), uracil (20 mg/L),
L-tryptophan (20 mg/L), L-histidine (20 mg/L), L-leucine (30 mg/L), L-lysine (30 mg/L). Colonies
were streaked on selective plates and used to inoculate liquid media.
Screening of S. cerevisiae transformants for dha genes 

Chromosomal DNA from URA+, HIS+, TRP+, LEU+ transformants was analyzed by PCR
using primers specific for each gene (SEQ ID NOS:38-45). The presence of all four open reading
frames was confirmed.

Expression of dhaB and dhaT activity in transformed S. cerevisiae 

The presence of active glycerol dehydratase (dhaB) and 1,3-propanediol oxidoreductase
(dhaT) was demonstrated using in vitro enzyme assays. Additionally, western blot analysis
confirmed protein expression from all four open reading frames.

Strain YPH500, transformed with the group of plasmids pMCK10, pMCK17, pMCK30 and
pMCK35, was grown on Supplemented Minimal Medium containing 0.67% yeast nitrogen base
without amino acids, 2% glucose, 20 mg/L adenine sulfate, and 30 mg/L L-lysine. Cells were
homogenized and extracts assayed for dhaB activity. A specific activity of 0.12 units per mg
protein was obtained for glycerol dehydratase, and 0.024 units per mg protein for 1,3-propanediol
oxidoreductase.

EXAMPLE 4 

PRODUCTION OF 1,3-PROPANEDIOL FROM D-GLUCOSE 
USING RECOMBINANT Saccharomyces cerevisiae 
S. cerevisiae YPH500, harboring the group of plasmids pMCK10, pMCK17, pMCK30 
and pMCK35, was grown in a BiostatB fermenter (B Braun Biotech, Inc.) in 1.0 L of minimal 
medium initially containing 20 g/L glucose, 6.7 g/L yeast nitrogen base without amino acids, 
40 mg/L adenine sulfate and 60 mg/L L-lysine·HCl. During the course of the growth, an additional 
equivalent of yeast nitrogen base, adenine and lysine was added. The fermenter was controlled 
at pH 5.5 with addition of 10% phosphoric acid and 2 M NaOH, 30 °C, and 40% dissolved oxygen 
tension through agitation control. After 38 h, the cells (OD600 = 5.8 AU) were harvested by 
centrifugation and resuspended in base medium (6.7 g/L yeast nitrogen base without amino 
acids, 20 mg/L adenine sulfate, 30 mg/L L-lysine·HCl, and 50 mM potassium phosphate buffer, 
pH 7.0). 

Reaction mixtures containing cells (OD600 = 20 AU) in a total volume of 4 mL of base 
media supplemented with 0.5% glucose, 5 ug/mL coenzyme B12 and 0, 10, 20, or 40 mM 
chloroquine were prepared, in the absence of light and oxygen (nitrogen sparging), in 10 mL 
crimp sealed serum bottles and incubated at 30 °C with shaking. After 30 h, aliquots were 
withdrawn and analyzed by HPLC. The results are shown in Table 3. 

Table 3 

Production of 1,3-propanediol using recombinant S. cerevisiae 

reaction    chloroquine (mM)    1,3-propanediol (mM) 
1           0                   0.2 
2           10                  0.2 
3           20                  0.3 
4           40                  0.7 



EXAMPLE 5 

USE OF A S. cerevisiae DOUBLE TRANSFORMANT FOR PRODUCTION 
OF 1,3-PROPANEDIOL FROM D-GLUCOSE WHERE dhaB AND dhaT ARE 
INTEGRATED INTO THE GENOME 

Example 5 prophetically demonstrates the transformation of S. cerevisiae with dhaB1, 
dhaB2, dhaB3, and dhaT and the stable integration of the genes into the yeast genome for the 
production of 1,3-propanediol from glucose. 
Construction of expression cassettes 

Four expression cassettes (dhaB1, dhaB2, dhaB3, and dhaT) are constructed for glucose- 
induced and high-level constitutive expression of these genes in yeast, Saccharomyces 
cerevisiae. These cassettes consist of: (i) the phosphoglycerate kinase (PGK) promoter from 
S. cerevisiae strain S288C; (ii) one of the genes dhaB1, dhaB2, dhaB3, or dhaT; and (iii) the PGK 
terminator from S. cerevisiae strain S288C. The PCR-based technique of gene splicing by 
overlap extension (Horton et al., BioTechniques, 8:528-535, (1990)) is used to recombine DNA 
sequences to generate these cassettes with seamless joints for optimal expression of each gene. 
These cassettes are cloned individually into a suitable vector (pLITMUS 39) with restriction sites 
amenable to multi-cassette cloning in yeast expression plasmids. 
Construction of yeast integration vectors 

Vectors used to effect the integration of expression cassettes into the yeast genome are 
constructed. These vectors contain the following elements: (i) a polycloning region into which 
expression cassettes are subcloned; (ii) a unique marker used to select for stable yeast 
transformants; (iii) replication origin and selectable marker allowing gene manipulation in E. coli 
prior to transforming yeast. One integration vector contains the URA3 auxotrophic marker 
(YIp352b), and a second integration vector contains the LYS2 auxotrophic marker (pKP7). 
Construction of yeast expression plasmids 

Expression cassettes for dhaB1 and dhaB2 are subcloned into the polycloning region of 
YIp352b (expression plasmid #1), and expression cassettes for dhaB3 and dhaT are 
subcloned into the polycloning region of pKP7 (expression plasmid #2). 
Transformation of yeast with expression plasmids 

S. cerevisiae (ura3, lys2) is transformed with expression plasmid #1 using Frozen-EZ 
Yeast Transformation kit (Zymo Research, Orange, CA), and transformants selected on plates 
lacking uracil. Integration of expression cassettes for dhaB1 and dhaB2 is confirmed by PCR 
analysis of chromosomal DNA. Selected transformants are re-transformed with expression 
plasmid #2 using Frozen-EZ Yeast Transformation kit, and double transformants selected on 
plates lacking lysine. Integration of expression cassettes for dhaB3 and dhaT is confirmed by 
PCR analysis of chromosomal DNA. The presence of all four expression cassettes (dhaB1, 
dhaB2, dhaB3, dhaT) in double transformants is confirmed by PCR analysis of chromosomal 
DNA. 

Protein production from double-transformed yeast 

Production of proteins encoded by dhaB1, dhaB2, dhaB3 and dhaT from double-transformed 
yeast is confirmed by Western blot analysis. 
Enzyme activity from double-transformed yeast 

The presence of active glycerol dehydratase and active 1,3-propanediol dehydrogenase in 
double-transformed yeast is confirmed by enzyme assay as described in General Methods above. 
Production of 1 ,3-propanediol from double-transformed yeast 

Production of 1,3-propanediol from glucose in double-transformed yeast is demonstrated 
essentially as described in Example 4. 

EXAMPLE 6 

CONSTRUCTION OF PLASMIDS CONTAINING DAR1/GPP2 
OR dhaT/dhaB1-3 AND TRANSFORMATION INTO KLEBSIELLA SPECIES 
K. pneumoniae (ATCC 25955), K. pneumoniae (ECL2106), and K. oxytoca (ATCC 8724) 
are naturally resistant to ampicillin (up to 150 ug/mL) and kanamycin (up to 50 ug/mL), but 
sensitive to tetracycline (10 ug/mL) and chloramphenicol (25 ug/mL). Consequently, replicating 
plasmids which encode resistance to these latter two antibiotics are potentially useful as cloning 
vectors for these Klebsiella strains. The wild-type K. pneumoniae (ATCC 25955), the glucose- 
derepressed K. pneumoniae (ECL2106), and K. oxytoca (ATCC 8724) were successfully 
transformed to tetracycline resistance by electroporation with the moderate-copy-number plasmid, 
pBR322 (New England Biolabs, Beverly, MA). This was accomplished by the following 
procedure: Ten mL of an overnight culture was inoculated into 1 L LB (1% (w/v) Bacto-tryptone 
(Difco, Detroit, MI), 0.5% (w/v) Bacto-yeast extract (Difco) and 0.5% (w/v) NaCl (Sigma, St. Louis, 
MO)) and the culture was incubated at 37 °C to an OD500 of 0.5-0.7. The cells were chilled on 
ice, harvested by centrifugation at 4000 x g for 15 min, and resuspended in 1 L ice-cold sterile 
10% glycerol. The cells were repeatedly harvested by centrifugation and progressively 
resuspended in 500 mL, 20 mL and, finally, 2 mL ice-cold sterile 10% glycerol. For 
electroporation, 40 uL of cells were mixed with 1-2 uL DNA in a chilled 0.2 cm cuvette and were 
pulsed at 200 Ω, 2.5 kV for 4-5 msec using a BioRad Gene Pulser (BioRad, Richmond, CA). 
One mL of SOC medium (2% (w/v) Bacto-tryptone (Difco), 0.5% (w/v) Bacto-yeast extract (Difco), 
10 mM NaCl, 10 mM MgCl2, 10 mM MgSO4, 2.5 mM KCl and 20 mM glucose) was added to the 
cells and, after the suspension was transferred to a 17 x 100 mm sterile polypropylene tube, the 
culture was incubated for 1 hr at 37 °C, 225 rpm. Aliquots were plated on selective medium, as 
indicated. Analyses of the plasmid DNA from independent tetracycline-resistant transformants 
showed the restriction endonuclease digestion patterns typical of pBR322, indicating that the 
vector was stably maintained after overnight culture at 37 °C in LB containing tetracycline 
(10 ug/mL). Thus, this vector, and derivatives such as pBR329 (ATCC 37264) which encodes 
resistance to ampicillin, tetracycline and chloramphenicol, may be used to introduce the 
DAR1/GPP2 and dhaT/dhaB1-3 expression cassettes into K. pneumoniae and K. oxytoca. 

The DAR1 and GPP2 genes may be obtained by PCR-mediated amplification from the 
Saccharomyces cerevisiae genome, based on their known DNA sequence. The genes are then 
transformed into K. pneumoniae or K. oxytoca under the control of one or more promoters that 
may be used to direct their expression in media containing glucose. For convenience, the genes 
were obtained on a 2.4 kb DNA fragment obtained by digestion of plasmid pAH44 with the PvuII 
restriction endonuclease, whereby the genes are already arranged in an expression cassette 
under the control of the E. coli lac promoter. This DNA fragment was ligated to PvuII-digested 
pBR329, producing the insertional inactivation of its chloramphenicol resistance gene. The 
ligated DNA was used to transform E. coli DH5α (Gibco, Gaithersburg, MD). Transformants were 
selected by their resistance to tetracycline (10 ug/mL) and were screened for their sensitivity to 
chloramphenicol (25 ug/mL). Analysis of the plasmid DNA from tetracycline-resistant, 
chloramphenicol-sensitive transformants confirmed the presence of the expected plasmids, in 
which the Plac-dar1-gpp2 expression cassette was subcloned in either orientation into the 
pBR329 PvuII site. These plasmids, designated pJSP1A (clockwise orientation) and pJSP1B 
(counterclockwise orientation), were separately transformed by electroporation into K. pneumoniae 
(ATCC 25955), K. pneumoniae (ECL2106) and K. oxytoca (ATCC 8724) as described. 
Transformants were selected by their resistance to tetracycline (10 ug/mL) and were screened for 
their sensitivity to chloramphenicol (25 ug/mL). Restriction analysis of the plasmids isolated from 
independent transformants showed only the expected digestion patterns, and confirmed that they 
were stably maintained at 37 °C with antibiotic selection. The expression of the DAR1 and GPP2 
genes may be enhanced by the addition of IPTG (0.2-2.0 mM) to the growth medium. 

The four K. pneumoniae dhaB(1-3) and dhaT genes may be obtained by PCR-mediated 
amplification from the K. pneumoniae genome, based on their known DNA sequence. These 
genes are then transformed into K. pneumoniae under the control of one or more promoters that 
may be used to direct their expression in media containing glucose. For convenience, the genes 
were obtained on an approximately 4.0 kb DNA fragment obtained by digestion of plasmid pAH24 
with the KpnI/SacI restriction endonucleases, whereby the genes are already arranged in an 
expression cassette under the control of the E. coli lac promoter. This DNA fragment was ligated 
to similarly digested pBC-KS+ (Stratagene, La Jolla, CA) and used to transform E. coli DH5α. 
Transformants were selected by their resistance to chloramphenicol (25 ug/mL) and were 
screened for a white colony phenotype on LB agar containing X-gal. Restriction analysis of the 
plasmid DNA from chloramphenicol-resistant transformants demonstrating the white colony 
phenotype confirmed the presence of the expected plasmid, designated pJSP2, in which the 
dhaT-dhaB(1-3) genes were subcloned under the control of the E. coli lac promoter. 

To enhance the conversion of glucose to 1,3-propanediol, this plasmid was separately 
transformed by electroporation into K. pneumoniae (ATCC 25955) (pJSP1A), K. pneumoniae 
(ECL2106) (pJSP1A) and K. oxytoca (ATCC 8724) (pJSP1A) already containing the 
Plac-dar1-gpp2 expression cassette. Cotransformants were selected by their resistance to both 
tetracycline (10 ug/mL) and chloramphenicol (25 ug/mL). Restriction analysis of the plasmids 
isolated from independent cotransformants showed the digestion patterns expected for both 
pJSP1A and pJSP2. The expression of the DAR1, GPP2, dhaB(1-3), and dhaT genes may be 
enhanced by the addition of IPTG (0.2-2.0 mM) to the medium. 

EXAMPLE 7 

Production of 1,3-propanediol from glucose by K. pneumoniae 
Klebsiella pneumoniae strains ECL 2105 and 2106-47, both transformed with pJSP1A, 
and ATCC 25955, transformed with pJSP1A and pJSP2, were grown in a 5 L Applikon fermenter 
under various conditions (see Table 4) for the production of 1,3-propanediol from glucose. Strain 
2104-47 is a fluoroacetate-tolerant derivative of ECL 2106 which was obtained from a 
fluoroacetate/lactate selection plate as described in Bauer et al., Appl. Environ. Microbiol. 56, 
1296 (1990). In each case, the medium used contained 50-100 mM potassium phosphate buffer, 
pH 7.5, 40 mM (NH4)2SO4, 0.1% (w/v) yeast extract, 10 uM CoCl2, 6.5 uM CuCl2, 100 uM 
FeCl3, 18 uM FeSO4, 5 uM H3BO3, 50 uM MnCl2, 0.1 uM Na2MoO4, 25 uM ZnCl2, 0.82 mM 
MgSO4, 0.9 mM CaCl2, and 10-20 g/L glucose. Additional glucose was fed, with residual glucose 
maintained in excess. Temperature was controlled at 37 °C and pH controlled at 7.5 with 5N 
KOH or NaOH. Appropriate antibiotics were included for plasmid maintenance; IPTG 
(isopropyl-β-D-thiogalactopyranoside) was added at the indicated concentrations as well. For 
anaerobic fermentations, 0.1 vvm nitrogen was sparged through the reactor; when the dO 
setpoint was 5%, 1 vvm air was sparged through the reactor and the medium was supplemented 
with vitamin B12. Final concentrations and overall yields (g/g) are shown in Table 4. 

Table 4 

Production of 1,3-propanediol from glucose by K. pneumoniae 

Organism               dO    IPTG, mM    vitamin B12, mg/L    Titer, g/L    Yield, g/g 
25955[pJSP1A/pJSP2]    0     0.5         0                    8.1           16% 
25955[pJSP1A/pJSP2]    5%    0.2         0.5                  5.2           4% 
2106[pJSP1A]           0     0           0                    4.9           17% 
2106[pJSP1A]           5%    0           5                    6.5           12% 
2105-47[pJSP1A]        5%    0.2         0.5                  10.9          12% 

EXAMPLE 8 

Conversion of carbon substrates to 1,3-propanediol by recombinant 
K. pneumoniae containing dar1, gpp2, dhaB, and dhaT 
A. Conversion of D-fructose to 1,3-propanediol by various K. pneumoniae recombinant strains: 
Single colonies of K. pneumoniae (ATCC 25955 pJSP1A), K. pneumoniae (ATCC 25955 
pJSP1A/pJSP2), K. pneumoniae (ATCC 2106 pJSP1A), and K. pneumoniae (ATCC 2106 
pJSP1A/pJSP2) were transferred from agar plates and in separate culture tubes were 
subcultured overnight in Luria-Bertani (LB) broth containing the appropriate antibiotic agent(s). 
Each reaction used a 50-mL flask containing 45 mL of a sterile-filtered minimal medium, defined as 
LLMM/F, which contains per liter: 10 g fructose; 1 g yeast extract; 50 mmoles potassium phosphate, pH 7.5; 
40 mmoles (NH4)2SO4; 0.09 mmoles calcium chloride; 2.38 mg CoCl2·6H2O; 0.88 mg 
CuCl2·2H2O; 27 mg FeCl3·6H2O; 5 mg FeSO4·7H2O; 0.31 mg H3BO3; 10 mg MnCl2·4H2O; 
0.023 mg Na2MoO4·2H2O; 3.4 mg ZnCl2; 0.2 g MgSO4·7H2O. Tetracycline at 10 ug/mL was 
added to medium for reactions using either of the single plasmid recombinants; 10 ug/mL 
tetracycline and 25 ug/mL chloramphenicol for reactions using either of the double plasmid 
recombinants. The medium was thoroughly sparged with nitrogen prior to inoculation with 2 mL 
of the subculture. IPTG (I) at final concentration of 0.5 mM was added to some flasks. The flasks 
were capped, then incubated at 37 °C, 100 rpm in a New Brunswick Series 25 incubator/shaker. 
Reactions were run for at least 24 hours or until most of the carbon substrate was converted into 
products. Samples were analyzed by HPLC. Table 5 describes the yields of 1,3-propanediol 
(3G) produced from fructose by the various Klebsiella recombinants. 

Table 5 

Production of 1,3-propanediol from D-fructose using recombinant Klebsiella 

Klebsiella Strain       Medium        Conversion    [3G] (g/L)     Yield Carbon (%) 
2106 pBR329             LLMM/F        100           0              0 
2106 pJSP1A             LLMM/F        50            [illegible]    [illegible] 
2106 pJSP1A             LLMM/F + I    100           [illegible]    [illegible] 
2106 pJSP1A/pJSP2       LLMM/F        58            0.26           [illegible] 
25955 pBR329            LLMM/F        100           0              0 
25955 pJSP1A            LLMM/F        100           0.3            4 
25955 pJSP1A            LLMM/F + I    100           0.15           2 
25955 pJSP1A/pJSP2      LLMM/F        100           0.9            11 
25955 pJSP1A/pJSP2      LLMM/F + I    62            1.0            20 
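
The LLMM/F recipe in part A lists its trace salts in mg per liter, whereas the fermenter medium of Example 7 lists the corresponding salts as molar concentrations. The following is a minimal sketch of the unit conversion, not part of the original disclosure. The molar masses are approximate standard values supplied here as an assumption, and the names `MOLAR_MASS`, `LLMM_F_MG_PER_L` and `mg_per_l_to_um` are hypothetical.

```python
# Approximate molar masses in g/mol (standard values, NOT taken from the source text).
MOLAR_MASS = {
    "CoCl2.6H2O": 237.93,
    "FeCl3.6H2O": 270.30,
    "FeSO4.7H2O": 278.01,
    "H3BO3": 61.83,
    "MnCl2.4H2O": 197.91,
    "ZnCl2": 136.29,
}

# Amounts per liter of LLMM/F as listed in Example 8, part A (mg/L).
LLMM_F_MG_PER_L = {
    "CoCl2.6H2O": 2.38,
    "FeCl3.6H2O": 27.0,
    "FeSO4.7H2O": 5.0,
    "H3BO3": 0.31,
    "MnCl2.4H2O": 10.0,
    "ZnCl2": 3.4,
}


def mg_per_l_to_um(mg_per_l, molar_mass):
    """Convert mg/L to micromolar: (mg/L) / (g/mol) gives mmol/L; times 1000 gives umol/L."""
    return mg_per_l / molar_mass * 1000.0


# Micromolar concentration of each listed salt.
concentrations_um = {
    salt: mg_per_l_to_um(mg, MOLAR_MASS[salt])
    for salt, mg in LLMM_F_MG_PER_L.items()
}

if __name__ == "__main__":
    for salt, um in concentrations_um.items():
        print(f"{salt}: {um:.1f} uM")
```

Under these assumed molar masses the converted values fall close to the cobalt, iron, borate, manganese and zinc concentrations given for the Example 7 medium, which suggests the two recipes describe essentially the same trace-metal composition.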











B. Conversion of various carbon substrates to 1,3-propanediol by K. pneumoniae (ATCC 25955 
pJSP1A/pJSP2): 

An aliquot (0.1 mL) of frozen stock cultures of K. pneumoniae (ATCC 25955 
pJSP1A/pJSP2) was transferred to 50 mL Seed medium in a 250 mL baffled flask. The Seed 
medium contained per liter: 0.1 molar NaK/PO4 buffer, pH 7.0; 3 g (NH4)2SO4; 5 g glucose; 
0.15 g MgSO4·7H2O; 10 mL 100X Trace Element solution; 25 mg chloramphenicol; 10 mg 
tetracycline; and 1 g yeast extract. The 100X Trace Element solution contained per liter: 10 g citric acid, 
1.5 g CaCl2·2H2O, 2.8 g FeSO4·7H2O, 0.39 g ZnSO4·7H2O, 0.38 g CuSO4·5H2O, 0.2 g 
CoCl2·6H2O, and 0.3 g MnCl2·4H2O. The resulting solution was titrated to pH 7.0 with either 
KOH or H2SO4. The glucose, trace elements, antibiotics and yeast extract were sterilized 
separately. The seed inoculum was grown overnight at 35 °C and 250 rpm. 

The reaction design was semi-aerobic. The system consisted of 130 mL Reaction 
medium in 125 mL sealed flasks that were left partially open with aluminum foil strip. The 
Reaction Medium contained per liter: 3 g (NH4)2SO4; 20 g carbon substrate; 0.15 molar 
NaK/PO4 buffer, pH 7.5; 1 g yeast extract; 0.15 g MgSO4·7H2O; 0.5 mmoles IPTG; 10 mL 100X 
Trace Element solution; 25 mg chloramphenicol; and 10 mg tetracycline. The resulting solution 
was titrated to pH 7.5 with KOH or H2SO4. The carbon sources were: D-glucose (Glc); 
D-fructose (Frc); D-lactose (Lac); D-sucrose (Suc); D-maltose (Mal); and D-mannitol (Man). A 
few glass beads were included in the medium to improve mixing. The reactions were initiated by 
addition of seed inoculum so that the optical density of the cell suspension started at 0.1 AU as 
measured at 500 nm. The flasks were incubated at 35 °C, 250 rpm. 3G production was 
measured by HPLC after 24 hr. Table 6 describes the yields of 1,3-propanediol produced from 
the various carbon substrates. 



Table 6 

Production of 1,3-propanediol from various carbon substrates 
using recombinant Klebsiella 25955 pJSP1A/pJSP2 

                      1,3-Propanediol (g/L) 
Carbon Substrate      Expt. 1    Expt. 2    Expt. 3 
Glc                   0.89       1          1.6 
Frc                   0.19       0.23       0.24 
Lac                   0.15       0.58       0.56 
Suc                   0.88       0.62 
Mal                   0.05       0.03       0.02 
Man                   0.03       0.05       0.04 



EXAMPLE 9 

IMPROVEMENT OF 1,3-PROPANEDIOL PRODUCTION USING dhaBX GENE 

Example 9 demonstrates the improved production of 1,3-propanediol in E. coli when a 
gene encoding a protein X is introduced. 
Construction of expression vector pTacIQ 

The E. coli expression vector pTacIQ was constructed by inserting the lacIq gene (Farabaugh, P.J. 1978, 
Nature 274 (5673) 765-769) and tac promoter (Amann et al, 1983, Gene 25, 167-178) 
into the EcoRI restriction endonuclease site of pBR322 (Sutcliffe, 1979, Cold Spring 
Harb. Symp. Quant. Biol. 43, 77-90). A multiple cloning site and terminator sequence (SEQ ID 
NO:50) replaces the pBR322 sequence from EcoRI to SphI. 
Subcloning the glycerol dehydratase genes (dhaB1,2,3,X) 

The region containing the entire coding region for Klebsiella dhaB1, dhaB2, dhaB3 and 
dhaBX of the dhaB operon from pHK28-26 was cloned into pBluescriptII KS+ (Stratagene) using 
the restriction enzymes KpnI and EcoRI to create the plasmid pM7. 

The open reading frame for the dhaB3 gene was amplified from pHK28-26 by PCR using 
primers (SEQ ID NO:51 and SEQ ID NO:52) incorporating an EcoRI site at the 5' end and an XbaI 
site at the 3' end. The product was subcloned into pLitmus29 (NEB) to generate the plasmid 
pDHAB3 containing dhaB3. 

The dhaBX gene was removed by digesting plasmid pM7 with ApaI and XbaI, purifying the 
5.9 kb fragment and ligating it with the 325-bp ApaI-XbaI fragment from plasmid pDHAB3 to 
create pM11 containing dhaB1, dhaB2 and dhaB3. 

The open reading frame for the dhaB1 gene was amplified from pHK28-26 by PCR using 
primers (SEQ ID NO:53 and SEQ ID NO:54) incorporating a HindIII site and a consensus ribosome 
binding site at the 5' end and an XbaI site at the 3' end. The product was subcloned into 
pLitmus28 (NEB) to generate the plasmid pDT1 containing dhaB1. 

A NotI-XbaI fragment from pM11 containing part of the dhaB1 gene, the dhaB2 gene and 
the dhaB3 gene was inserted into pDT1 to create the dhaB expression plasmid, pDT2. The 
HindIII-XbaI fragment containing the dhaB(1,2,3) genes from pDT2 was inserted into pTacIQ to 
create pDT3. 

Subcloning the TMG dehydrogenase gene (dhaT) 

The KpnI-SacI fragment of pHK28-26, containing the TMG dehydrogenase (dhaT) gene, 
was subcloned into pBluescriptII KS+ creating plasmid pAH1. The dhaT gene was cloned by PCR 
from pAH1 as template DNA and synthetic primers (SEQ ID NO:55 with SEQ ID NO:56) 
incorporating an XbaI site at the 5' end and a BamHI site at the 3' end. The product was 
subcloned into pCR-Script (Stratagene) at the SrfI site to generate the plasmids pAH4 and pAH5 
containing dhaT. The pAH4 contains the dhaT gene in the right orientation for expression from the 
lac promoter in pCR-Script and pAH5 contains the dhaT gene in the opposite orientation. The 
XbaI-BamHI fragment from pAH4 containing the dhaT gene was inserted into pTacIQ to generate 
plasmid pAH8. The HindIII-BamHI fragment from pAH8 containing the RBS and dhaT gene was 
inserted into pBluescriptII KS+ to create pAH11. 
Construction of an expression cassette for dhaT and dhaB(1,2,3) 

An expression cassette for dhaT and dhaB(1,2,3) was assembled from the individual 
dhaB(1,2,3) and dhaT subclones described previously using standard molecular biology methods. 
A SpeI-SacI fragment containing the dhaB(1,2,3) genes from pDT3 was inserted into pAH11 at 
the SpeI-SacI sites to create pAH24. A SalI-XbaI linker (SEQ ID NO:57 and SEQ ID NO:58) was 
inserted into pAH5 which was digested with the restriction enzymes SalI-XbaI to create pDT16. 
The linker destroys the XbaI site. The 1 kb SalI-MluI fragment from pDT16 was then inserted into 
pAH24 replacing the existing SalI-MluI fragment to create pDT18. 
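
The statement that the linker destroys the XbaI site can be checked directly from the two linker oligonucleotides (SEQ ID NO:57 and SEQ ID NO:58). The sketch below is illustrative only and is not part of the original disclosure. It assumes the standard recognition sequences (SalI GTCGAC, XbaI TCTAGA, EcoRI GAATTC) and assumes that the digested vector contributes a "G" upstream of the SalI overhang and "CTAGA" downstream of the XbaI overhang; the names `revcomp`, `duplex_ok` and `junction` are hypothetical.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


LINKER_TOP = "TCGACGAATTCAGGAGGA"     # SEQ ID NO:57, written 5' to 3'
LINKER_BOTTOM = "CTAGTCCTCCTGAATTCG"  # SEQ ID NO:58, written 5' to 3'

SALI, XBAI, ECORI = "GTCGAC", "TCTAGA", "GAATTC"

# After annealing, each oligo should leave a 4-nt 5' overhang and the
# remaining 14 nt should be perfectly complementary.
overhangs = (LINKER_TOP[:4], LINKER_BOTTOM[:4])
duplex_ok = LINKER_TOP[4:] == revcomp(LINKER_BOTTOM[4:])

# Top strand across both ligation junctions (assumed vector flanks):
# SalI cuts G^TCGAC, so the vector end supplies "...G";
# XbaI cuts T^CTAGA, so the vector end supplies "CTAGA...".
junction = "G" + LINKER_TOP + "CTAGA"

if __name__ == "__main__":
    print("overhangs:", overhangs)
    print("duplex complementary:", duplex_ok)
    print("SalI site regenerated:", SALI in junction)
    print("XbaI site regenerated:", XBAI in junction)
    print("EcoRI site introduced:", ECORI in junction)
```

Under these assumptions the annealed linker carries SalI- and XbaI-compatible overhangs, the ligated junction reads ...GGAGGACTAGA... and so no longer contains TCTAGA, and the linker introduces an EcoRI site together with a ribosome-binding-like AGGAGG motif.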
Plasmid for the over-expression of dhaT and dhaB(1,2,3,X) in E. coli 

The 4.4 kb NotI-XbaI fragment containing part of the dhaB1 gene, dhaB2, dhaB3 and 
dhaBX from plasmid pM7 was purified and ligated with the 4.1 kb NotI-XbaI fragment from 
plasmid pDT18 (restoring dhaB1) to create pM33 containing dhaB1, dhaB2, dhaB3 and 
dhaBX. 
E. coli strain 

E. coli DH5α was obtained from BRL (Difco). This strain was transformed with the 
plasmids pM7, pM11, pM33 or pDT18 and selected on LA plates containing 100 ug/ml 
carbenicillin. 
Production of 1,3-propanediol 

E. coli DH5α, containing plasmid pM7, pM11, pM33 or pDT18, was grown on LA plates 
plus 100 ug/ml carbenicillin overnight at 37°C. One colony from each was used to inoculate 25 ml 
of media (0.2 M KH2PO4, citric acid 2.0 g/L, MgSO4·7H2O 2.0 g/L, H2SO4 (98%) 1.2 ml/L, 
ferric ammonium citrate 0.3 g/L, CaCl2·2H2O 0.2 gram, yeast extract 5 g/L, glucose 10 g/L, 
glycerol 30 g/L) plus vitamin B12 0.005 g/L, 0.2 mM IPTG, 200 ug/ml carbenicillin and 5 ml 
modified Balch's trace-element solution (the composition of which can be found in Methods for 
General and Molecular Bacteriology (P. Gerhardt et al., eds, p 158, American Society for 
Microbiology, Washington, DC 1994)), final pH 6.8 (NH4OH), then filter-sterilized in 250 ml 
Erlenmeyer flasks. The shake flasks were incubated at 37°C with shaking (300 rpm) for several 
days, during which they were sampled for HPLC analysis by standard procedures. Final yields 
are shown in Table 7. 

Overall, as shown in Table 7, the results indicate that the expression of dhaBX in 
plasmids expressing dhaB(1,2,3) or dhaT-dhaB(1,2,3) greatly enhances the production of 
1,3-propanediol. 

TABLE 7 

Effect of dhaBX expression on the production of 1,3-propanediol by E. coli 

Strain                            Time (days)    1,3-propanediol (mg/L)* 
DH5α/pM7 (dhaB1,2,3,X)            1              1500 
                                  2              2700 
DH5α/pM11 (dhaB1,2,3)             1              < 200 ug 
                                  2              < 200 ug 
DH5α/pM33 (dhaT-dhaB1,2,3,X)      2              1200 
DH5α/pDT18 (dhaT-dhaB1,2,3)       2              88 

* Expressed as an average from several experiments. 
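
The size of the dhaBX effect in Table 7 can be expressed as a fold change between matched strains. The following is a small illustrative sketch, not part of the original disclosure; it uses only the day 2 values for pM33 and pDT18, since both are given as definite numbers, and the names `DAY2_TITER_MG_PER_L` and `fold_change` are hypothetical.

```python
# Day 2 titers from Table 7 (mg/L), for the two dhaT-containing constructs.
DAY2_TITER_MG_PER_L = {
    "pM33 (dhaT-dhaB1,2,3,X)": 1200.0,
    "pDT18 (dhaT-dhaB1,2,3)": 88.0,
}


def fold_change(with_x, without_x):
    """Ratio of the titer with dhaBX to the titer without it."""
    if without_x <= 0:
        raise ValueError("reference titer must be positive")
    return with_x / without_x


fold = fold_change(
    DAY2_TITER_MG_PER_L["pM33 (dhaT-dhaB1,2,3,X)"],
    DAY2_TITER_MG_PER_L["pDT18 (dhaT-dhaB1,2,3)"],
)

if __name__ == "__main__":
    print(f"dhaBX fold enhancement at day 2: {fold:.1f}x")
```

The pM7/pM11 pair is left out because the pM11 entries are reported only as upper bounds, so only a lower limit on that ratio could be stated.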
Primers: 

SEQ ID NO 50: MCS-TERMINATOR: 

5' AGCTTAGGAGTCTAGAATATTGAGCTCGAATTCCCGGGCATGCGGTACCGGATCCAGAAAA 
AAGCCCGCACCTGACAGTGCGGGCTTTTTTTTT 3' 

SEQ ID NO 51: dhaB3 5' end-EcoRI 
GGAATTCAGATCTCAGCAATGAGCGAGAAAACCATGC 

SEQ ID NO 52: dhaB3 3' end-XbaI 
GCTCTAGATTAGCTTCCTTTACGCAGC 

SEQ ID NO 53: dhaB1 5' end-HindIII-SD 

5' GGCCAAGCTTAAGGAGGTTAATTAAATGAAAAG 3' 

SEQ ID NO 54: dhaB1 3' end-XbaI 

5' GCTCTAGATTATTCAATGGTGTCGGG 3' 

SEQ ID NO 55: dhaT 5' end-XbaI 

5' GCGCCGTCTAGAATTATGAGCTATCGTATGTTTGATTATCTG 3' 

SEQ ID NO 56: dhaT 3' end-BamHI 

5' TCTGATACGGGATCCTCAGAATGCCTGGCGGAAAAT 3' 

SEQ ID NO 57: pUSH Linker1: 
5' TCGACGAATTCAGGAGGA 3' 

SEQ ID NO 58: pUSH Linker2: 
5' CTAGTCCTCCTGAATTCG 3' 
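
Each PCR primer above is named for the restriction site it is meant to introduce. A quick consistency check is to confirm that the named recognition sequence actually occurs in the primer. The sketch below is illustrative and not part of the original disclosure: it assumes the standard recognition sequences for EcoRI, XbaI, HindIII and BamHI, uses the primer sequences with extraction spacing removed, and the names `PRIMERS`, `SITES` and `site_position` are hypothetical.

```python
# Standard recognition sequences (assumed; not stated in the source text).
SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}

# SEQ ID NO -> (primer sequence 5' to 3', site named in the primer label).
PRIMERS = {
    51: ("GGAATTCAGATCTCAGCAATGAGCGAGAAAACCATGC", "EcoRI"),
    52: ("GCTCTAGATTAGCTTCCTTTACGCAGC", "XbaI"),
    53: ("GGCCAAGCTTAAGGAGGTTAATTAAATGAAAAG", "HindIII"),
    54: ("GCTCTAGATTATTCAATGGTGTCGGG", "XbaI"),
    55: ("GCGCCGTCTAGAATTATGAGCTATCGTATGTTTGATTATCTG", "XbaI"),
    56: ("TCTGATACGGGATCCTCAGAATGCCTGGCGGAAAAT", "BamHI"),
}


def site_position(primer, enzyme):
    """Return the 0-based index of the enzyme's site in the primer, or -1 if absent."""
    return primer.find(SITES[enzyme])


# Position of the named site in each primer.
positions = {
    seq_id: site_position(primer, enzyme)
    for seq_id, (primer, enzyme) in PRIMERS.items()
}

if __name__ == "__main__":
    for seq_id, (primer, enzyme) in PRIMERS.items():
        print(f"SEQ ID NO:{seq_id} {enzyme} site at index {positions[seq_id]}")
```

In every case the site sits within a few bases of the 5' end of the primer, which is the usual arrangement when a restriction site is added as a 5' tail ahead of the gene-specific region.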

EXAMPLE 10 
Reactivation of the Glycerol Dehydratase Activity 

Example 10 demonstrates the in vivo reactivation of the glycerol dehydratase activity in 
microorganisms containing at least one gene encoding protein X. 

Plasmids pM7 and pM11 were constructed as described in Example 9 and transformed 
into E. coli DH5α cells. The transformed cells were cultured and assayed for the production of 
1,3-propanediol according to the method of Honda et al. (1980, In Situ Reactivation of 
Glycerol-Inactivated Coenzyme B12-Dependent Enzymes, Glycerol Dehydratase and Diol Dehydratase. 
Journal of Bacteriology 143:1458-1465). 

Materials and methods 

Toluenization of Cells 

The cells were grown to mid-log phase and were harvested by centrifugation at room 
temperature early in growth, i.e. 0.2 < OD600 < 0.8. The harvested cells were washed 2x in 50 mM 
KPO4 pH 8.0 at room temperature. The cells were resuspended to OD600 20-30 in 50 mM KPO4 
pH 8.0. The absolute OD is not critical. A lower cell mass is resuspended in less volume. If 
coenzyme B12 is added at this point, the remainder of the steps are performed in the dark. 



WO 98/21341 PCT/US97/20873 




Toluene is added to 1% final volume of cell suspension and the suspension is shaken vigorously 
for 5 minutes at room temperature. The suspension is centrifuged to pellet the cells. The cells 
are washed 2x in 50 mM KPO4 pH 8.0 at room temperature (25 ml each). The cell pellet is 
resuspended in the same volume as was used prior to toluene addition and transferred to fresh 
tubes. The OD600 of the toluenized cells was measured and recorded, and the cells were stored at 4 degrees C. 
Whole Cell Glycerol Dehydratase Assay 

The toluene treated cells were assayed at 37 degrees C for the presence of dehydratase 
activity. Three sets of reactions were carried out as shown below: no ATP, ATP added at 0 time, 
and ATP added at 10 minutes. 

No ATP:              100 ul   2 M glycerol 
                     100 ul   150 uM CoB12 
                     700 ul   buffer (0.03 M KPO4 / 0.5 M KCl, pH 8.0) 

T=0 minute ATP:      100 ul   2 M glycerol 
                     100 ul   150 uM CoB12 
                     600 ul   buffer (0.03 M KPO4 / 0.5 M KCl, pH 8.0) 
                     100 ul   30 mM ATP / 30 mM MnCl2 

T=10 minute ATP:     100 ul   2 M glycerol 
                     100 ul   150 uM CoB12 
                     700 ul   buffer (0.03 M KPO4 / 0.5 M KCl, pH 8.0) 

Controls were prepared for each of the above conditions by adding 100 ul buffer instead of 
CoB12. The tubes were mixed. 50 ul MBTH (3-methyl-2-benzothiazolinone hydrazone) (6 
mg/ml in 375 mM glycine/HCl, pH 2.7) was added to each of these tubes, and incubation was 
continued in ice water. The reaction tubes were placed in a 37 degree C water bath for a few minutes to 
equilibrate to 37 degree C. A tube containing enough toluenized cells for all assay tubes was 
placed into the 37 degree C water bath for a few minutes to equilibrate to 37 degree C. A tube 
containing 2.5 fold diluted (in assay buffer) 30 mM ATP / 30 mM MnCl2 (12 mM each) was placed 
into the 37 degree C water bath for a few minutes to equilibrate to 37 degree C. 100 ul of cell 
suspension was added to all tubes and samples were taken at 0, 1, 2, 3, 4, 5, 10, 15, 20 and 30 
minutes. At every timepoint, 100 ul of reaction was withdrawn and immediately added to 50 ul 
ice cold MBTH, vortexed, and placed in an ice water bath. At T=10 minutes, a sample was 
withdrawn and added to MBTH, then 100 ul of the 2.5 fold diluted ATP/Mn was added as fast as 
possible. When all samples were collected, the sample tube rack was added to a boiling water 
bath and boiled for three minutes. The tubes were chilled in an ice water bath for 30 seconds. 
500 ul of freshly prepared 3.3 mg/ml FeCl3·6H2O was added to the tubes and the tubes 
vortexed. The tubes were incubated at room temperature for 30 minutes, diluted 10x in H2O, and 
then centrifuged to collect the cells and particulates. The absorbance was measured at 670 nm 
and samples were diluted to keep the OD under 1.0. 

Example of Calculation of Activity 
The observed OD670 was multiplied by the dilution factor to determine absorbance. The blank 
absorbance for that reaction series was subtracted, and the T0 A670nm was subtracted. The 
absolute A670nm was divided by 53.4 (mM extinction coefficient for 3OH-propionaldehyde) and 
the mM concentration was multiplied by any dilution of reaction during timecourse. Because 1 ml 
reaction was used, the concentration (umoles/ml) of 3OH-propionaldehyde was divided by the 
mgs dry weight used in the assay (calculated via OD600 and 1 OD600 = 0.436 mgs dry weight) to 
get umoles aldehyde per mg dry weight cells. 
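
The calculation described above can be written out step by step. The following is a minimal sketch under stated assumptions, not the applicants' own worksheet: it assumes that the blank and T0 values are already dilution-corrected absorbances, that the T0 value is itself blank-corrected, and that "1 OD600 = 0.436 mgs dry weight" applies per ml of cell suspension. The function and parameter names are hypothetical, and the example inputs are invented purely to exercise the arithmetic.

```python
MM_EXTINCTION_COEFF = 53.4       # A670 per mM 3OH-propionaldehyde (from the text)
MG_DRY_WEIGHT_PER_OD600 = 0.436  # mg dry weight per OD600 unit (from the text; per-ml basis assumed)
REACTION_VOLUME_ML = 1.0         # 1 ml reaction, as stated in the text


def aldehyde_umol_per_mg(od670, dilution_factor, blank_a670, t0_a670,
                         reaction_dilution, cell_od600, cell_volume_ml):
    """Return umoles of aldehyde formed per mg dry weight of cells."""
    # Step 1: correct the observed reading for sample dilution.
    absorbance = od670 * dilution_factor
    # Step 2: subtract the blank and the time-zero absorbance.
    absolute_a670 = absorbance - blank_a670 - t0_a670
    # Step 3: convert to mM (equivalently umol/ml) and correct for any
    # dilution of the reaction during the time course.
    conc_mm = absolute_a670 / MM_EXTINCTION_COEFF * reaction_dilution
    # Step 4: total umoles in the reaction volume.
    umol = conc_mm * REACTION_VOLUME_ML
    # Step 5: dry weight of the cells added to the assay.
    mg_dry_weight = cell_od600 * cell_volume_ml * MG_DRY_WEIGHT_PER_OD600
    return umol / mg_dry_weight


# Invented example inputs: 0.1 ml of an OD600 = 25 cell suspension per reaction.
example = aldehyde_umol_per_mg(
    od670=0.5, dilution_factor=10, blank_a670=0.2, t0_a670=0.3,
    reaction_dilution=1.0, cell_od600=25.0, cell_volume_ml=0.1,
)

if __name__ == "__main__":
    print(f"{example:.4f} umol aldehyde per mg dry weight")
```

Keeping each step as a separate line mirrors the order of operations in the text, so a different convention (for instance a T0 value that is not blank-corrected) can be substituted by changing a single line.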

Results 

As shown in Figure 6, whole E. coli cells were assayed for reactivation of glycerol 
dehydratase in the absence and presence of added ATP and Mn++. The results indicate that cells 
containing a plasmid carrying dhaB 1, 2 and 3 as well as protein X have the ability to reactivate 
catalytically inactivated glycerol dehydratase. Cells containing protein 1, protein 2 and protein 
3 have increased ability to reactivate the catalytically inactivated glycerol dehydratase. 

As shown in Figure 7, whole E. coli cells were assayed for reactivation of glycerol- 
inactivated glycerol dehydratase in the absence and in the presence of added ATP and Mn++. 
The results show that cells containing dhaB subunits 1, 2 and 3 and X have the ability to reactivate 
catalytically inactivated glycerol dehydratase. Cells lacking the protein X gene do not have the 
ability to reactivate the catalytically inactivated glycerol dehydratase. 

Figures 9 and 10 illustrate that host cells containing plasmid pHK28-26 (Figure 1), when 
cultured under conditions suitable for the production of 1,3-propanediol, produced more 
1,3-propanediol than host cells transformed with pDT24 and cultured under conditions suitable for the 
production of 1,3-propanediol. Plasmid pDT24 is a derivative of pDT18 (described in Example 9) 
and contains dhaT, dhaB 1, 2, 3 and protein X, but lacks proteins 1, 2 and 3. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: MARIA DIAZ-TORRES 
NIGEL DUNN-COLEMAN 
MATTHEW CHASE 

(ii) TITLE OF INVENTION: METHOD FOR THE RECOMBINANT PRODUCTION OF 1,3 PROPANEDIOL 

(iii) NUMBER OF SEQUENCES: 49 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: GENENCOR INTERNATIONAL, INC. 

(B) STREET: 4 CAMBRIDGE PLACE 
1870 SOUTH WINTON ROAD 

(C) CITY: ROCHESTER 

(D) STATE: NEW YORK 

(E) COUNTRY: U.S.A. 

(F) POSTAL CODE (ZIP): 14618 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: 3.50 INCH DISKETTE 

(B) COMPUTER: IBM PC COMPATIBLE 

(C) OPERATING SYSTEM: MICROSOFT WINDOWS 3.1 

(D) SOFTWARE: MICROSOFT WORD 2.0C 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 11/13/97 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 60/030,601 

(B) FILING DATE: 11/13/96 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: GLAISTER, DEBRA 

(B) REGISTRATION NO.: 33,888 

(C) REFERENCE/DOCKET NUMBER: GC 369-2 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 650-864-7620 

(B) TELEFAX: 650-845-6504 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1668 base pairs 

WO 98/21341 PCT/US97/20873 


(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: DHAB1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 



ATGAAAAGAT CAAAACGATT TGCAGTACTG GCCCAGCGCC CCGTCAATCA GGACGGGCTG 60 

ATTGGCGAGT GGCCTGAAGA GGGGCTGATC GCCATGGACA GCCCCTTTGA CCCGGTCTCT 120 

TCAGTAAAAG TGGACAACGG TCTGATCGTC GAACTGGACG GCAAACGCCG GGACCAGTTT 180 

GACATGATCG ACCGATTTAT CGCCGATTAC GCGATCAACG TTGAGCGCAC AGAGCAGGCA 240 

ATGCGCCTGG AGGCGGTGGA AATAGCCCGT ATGCTGGTGG ATATTCACGT CAGCCGGGAG 300 

GAGATCATTG CCATCACTAC CGCCATCACG CCGGCCAAAG CGGTCGAGGT GATGGCGCAG 360 

ATGAACGTGG TGGAGATGAT GATGGCGCTG CAGAAGATGC GTGCCCGCCG GACCCCCTCC 420 

AACCAGTGCC ACGTCACCAA TCTCAAAGAT AATCCGGTGC AGATTGCCGC TGACGCCGCC 480 

GAGGCCGGGA TCCGCGGCTT [illegible] [illegible] TCGGTATCGC GCGCTACGCG 540 

CCGTTTAACG CCCTGGCGCT GTTGGTCGGT TCGCAGTGCG GCCGCCCCGG CGTGTTGACG 600 

CAGTGCTCGG TGGAAGAGGC CACCGAGCTG GAGCTGGGCA TGCGTGGCTT AACCAGCTAC 660 

GCCGAGACGG TGTCGGTCTA CGGCACCGAA GCGGTATTTA CCGACGGCGA TGATACGCCG 720 

TGGTCAAAGG CGTTCCTCGC CTCGGCCTAC GCCTCCCGCG GGTTGAAAAT GCGCTACACC 780 

TCCGGCACCG GATCCGAAGC GCTGATGGGC TATTCGGAGA GCAAGTCGAT GCTCTACCTC 840 

GAATCGCGCT GCATCTTCAT TACTAAAGGC GCCGGGGTTC AGGGACTGCA AAACGGCGCG 900 

GTGAGCTGTA TCGGCATGAC CGGCGCTGTG CCGTCGGGCA TTCGGGCGGT GCTGGCGGAA 960 

AACCTGATCG CCTCTATGCT CGACCTCGAA GTGGCGTCCG CCAACGACCA GACTTTCTCC 1020 

CACTCGGATA TTCGCCGCAC CGCGCGCACC CTGATGCAGA TGCTGCCGGG CACCGACTTT 1080 

ATTTTCTCCG GCTACAGCGC GGTGCCGAAC TACGACAACA TGTTCGCCGG CTCGAACTTC 1140 

GATGCGGAAG ATTTTGATGA TTACAACATC CTGCAGCGTG ACCTGATGGT TGACGGCGGC 1200 

CTGCGTCCGG TGACCGAGGC GGAAACCATT GCCATTCGCC AGAAAGCGGC GCGGGCGATC 1260 

CAGGCGGTTT TCCGCGAGCT GGGGCTGCCG CCAATCGCCG ACGAGGAGGT GGAGGCCGCC 1320 

ACCTACGCGC ACGGCAGCAA CGAGATGCCG CCGCGTAACG TGGTGGAGGA TCTGAGTGCG 1380 

GTGGAAGAGA TGATGAAGCG CAACATCACC GGCCTCGATA TTGTCGGCGC GCTGAGCCGC 1440 

AGCGGCTTTG AGGATATCGC CAGCAATATT CTCAATATGC TGCGCCAGCG GGTCACCGGC 1500 

GATTACCTGC AGACCTCGGC CATTCTCGAT CGGCAGTTCG AGGTGGTGAG TGCGGTCAAC 1560 

GACATCAATG ACTATCAGGG GCCGGGCACC GGCTATCGCA TCTCTGCCGA ACGCTGGGCG 1620 

GAGATCAAAA ATATTCCGGG CGTGGTTCAG CCCGACACCA TTGAATAA 1668 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 585 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: DHAB2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

GTGCAACAGA CAACCCAAAT TCAGCCCTCT TTTACCCTGA AAACCCGCGA GGGCGGGGTA 60 

GCTTCTGCCG ATGAACGCGC CGATGAAGTG GTGATCGGCG TCGGCCCTGC CTTCGATAAA 120 

CACCAGCATC ACACTCTGAT CGATATGCCC CATGGCGCGA TCCTCAAAGA GCTGATTGCC 180 

GGGGTGGAAG AAGAGGGGCT TCACGCCCGG GTGGTGCGCA TTCTGCGCAC GTCCGACGTC 240 

TCCTTTATGG CCTGGGATGC GGCCAACCTG AGCGGCTCGG GGATCGGCAT CGGTATCCAG 300 

TCGAAGGGGA CCACGGTCAT CCATCAGCGC GATCTGCTGC CGCTCAGCAA CCTGGAGCTG 360 

TTCTCCCAGG CGCCGCTGCT GACGCTGGAG ACCTACCGGC AGATTGGCAA AAACGCTGCG 420 

CGCTATGCGC GCAAAGAGTC ACCTTCGCCG GTGCCGGTGG TGAACGATCA GATGGTGCGG 480 

CCGAAATTTA TGGCCAAAGC CGCGCTATTT CATATCAAAG AGACCAAACA TGTGGTGCAG 540 

GACGCCGAGC CCGTCACCCT GCACATCGAC TTAGTAAGGG AGTGA 585 

(2) INFORMATION FOR SEQ ID NO: 3: 



(i) 



SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 426 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: DHAB3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATGAGCGAGA AAACCATGCG CGTGCAGGAT TATCCGTTAG CCACCCGCTG CCCGGAGCAT 60 

ATCCTGACGC CTACCGGCAA ACCATTGACC GATATTACCC TCGAGAAGGT GCTCTCTGGC 120 

GAGGTGGGCC CGCAGGATGT GCGGATCTCC CGCCAGACCC TTGAGTACCA GGCGCAGATT 180 

GCCGAGCAGA TGCAGCGCCA TGCGGTGGCG CGCAATTTCC GCCGCGCGGC GGAGCTTATC 240 

GCCATTCCTG ACGAGCGCAT TCTGGCTATC TATAACGCGC TGCGCCCGTT CCGCTCCTCG 300 

CAGGCGGAGC TGCTGGCGAT CGCCGACGAG CTGGAGCACA CCTGGCATGC GACAGTGAAT 360 

GCCGCCTTTG TCCGGGAGTC GGCGGAAGTG TATCAGCAGC GGCATAAGCT GCGTAAAGGA 420 

AGCTAA 426 
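
Each entry in this listing states a LENGTH and then prints the residues in blocks of ten with a running position number at the end of each row. The following is a minimal sketch, not part of the original listing, of how the declared length of SEQ ID NO: 3 (426 base pairs) can be checked against its printed rows. The helper names `count_residues` and `length_matches` are hypothetical, and the check assumes that only the letters A, C, G and T are residues, so spaces, digits and stray punctuation are ignored.

```python
import re

# Rows of SEQ ID NO: 3 as printed in the listing, position numbers included.
SEQ3_ROWS = """
ATGAGCGAGA AAACCATGCG CGTGCAGGAT TATCCGTTAG CCACCCGCTG CCCGGAGCAT 60
ATCCTGACGC CTACCGGCAA ACCATTGACC GATATTACCC TCGAGAAGGT GCTCTCTGGC 120
GAGGTGGGCC CGCAGGATGT GCGGATCTCC CGCCAGACCC TTGAGTACCA GGCGCAGATT 180
GCCGAGCAGA TGCAGCGCCA TGCGGTGGCG CGCAATTTCC GCCGCGCGGC GGAGCTTATC 240
GCCATTCCTG ACGAGCGCAT TCTGGCTATC TATAACGCGC TGCGCCCGTT CCGCTCCTCG 300
CAGGCGGAGC TGCTGGCGAT CGCCGACGAG CTGGAGCACA CCTGGCATGC GACAGTGAAT 360
GCCGCCTTTG TCCGGGAGTC GGCGGAAGTG TATCAGCAGC GGCATAAGCT GCGTAAAGGA 420
AGCTAA 426
"""
DECLARED_LENGTH = 426  # from "(A) LENGTH: 426 base pairs"


def count_residues(rows: str) -> int:
    """Count A/C/G/T letters, ignoring whitespace, digits and punctuation."""
    return len(re.sub(r"[^ACGT]", "", rows.upper()))


def length_matches(rows: str, declared: int) -> bool:
    """Return True when the printed residues add up to the declared length."""
    return count_residues(rows) == declared


if __name__ == "__main__":
    print(count_residues(SEQ3_ROWS), length_matches(SEQ3_ROWS, DECLARED_LENGTH))
```

Because the count ignores everything except residue letters, the same check tolerates rows in which a block or a position number has been split by extra spaces.
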

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1164 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: DHAT 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

ATGAGCTATC GTATGTTTGA TTATCTGGTG CCAAACGTTA ACTTTTTTGG CCCCAACGCC 60 

ATTTCCGTAG TCGGCGAACG CTGCCAGCTG CTGGGGGGGA AAAAAGCCCT GCTGGTCACC 120 

GACAAAGGCC TGCGGGCAAT TAAAGATGGC GCGGTGGACA AAACCCTGCA TTATCTGCGG 180 

GAGGCCGGGA TCGAGGTGGC GATCTTTGAC GGCGTCGAGC CGAACCCGAA AGACACCAAC 240 

GTGCGCGACG GCCTCGCCGT GTTTCGCCGC GAACAGTGCG ACATCATCGT CACCGTGGGC 300 

GGCGGCAGCC CGCACGATTG CGGCAAAGGC ATCGGCATCG CCGCCACCCA TGAGGGCGAT 360 

CTGTACCAGT ATGCCGGAAT CGAGACCCTG ACCAACCCGC TGCCGCCTAT CGTCGCGGTC 420 

AATACCACCG CCGGCACCGC CAGCGAGGTC ACCCGCCACT GCGTCCTGAC CAACACCGAA 480 

ACCAAAGTGA AGTTTGTGAT CGTCAGCTGG CGCAAACTGC CGTCGGTCTC TATCAACGAT 540 

CCACTGCTGA TGATCGGTAA ACCGGCCGCC CTGACCGCGG CGACCGGGAT GGATGCCCTG 600 

ACCCACGCCG TAGAGGCCTA TATCTCCAAA GACGCTAACC CGGTGACGGA CGCCGCCGCC 660 

ATGCAGGCGA TCCGCCTCAT CGCCCGCAAC CTGCGCCAGG CCGTGGCCCT CGGCAGCAAT 720 

CTGCAGGCGC GGGAAAACAT GGCCTATGCT TCTCTGCTGG CCGGGATGGC TTTCAATAAC 780 

GCCAACCTCG GCTACGTGCA CGCCATGGCG CACCAGCTGG GCGGCCTGTA CGACATGCCG 840 

CACGGCGTGG CCAACGCTGT CCTGCTGCCG CATGTGGCGC GCTACAACCT GATCGCCAAC 900 

CCGGAGAAAT TCGCCGATAT CGCTGAACTG ATGGGCGAAA ATATCACCGG ACTGTCCACT 960 

CTCGACGCGG CGGAAAAAGC CATCGCCGCT ATCACGCGTC TGTCGATGGA TATCGGTATT 1020 

CCGCAGCATC TGCGCGATCT GGGGGTAAAA GAGGCCGACT TCCCCTACAT GGCGGAGATG 1080 

GCTCTAAAAG ACGGCAATGC GTTCTCGAAC CCGCGTAAAG GCAACGAGCA GGAGATTGCC 1140 

GCGATTTTCC GCCAGGCATT CTGA 1164 
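
SEQ ID NOs: 1 to 4 each begin with a start codon, end with a stop codon and have a length divisible by three, which is what one would expect of complete coding sequences. The following is a minimal sketch of that consistency check, offered as an illustration only. The lengths and terminal codons are read from the listing above; the function names are hypothetical, and treating GTG (the first codon of SEQ ID NO: 2) as a valid bacterial start codon is an assumption of the sketch rather than a statement made in the listing.

```python
# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}
# ATG is the usual start codon; GTG is assumed here as an alternative
# bacterial start codon because SEQ ID NO: 2 begins with it.
START_CODONS = {"ATG", "GTG"}

# (label, declared length in base pairs, first codon, last codon),
# all read from the printed listing.
ENTRIES = [
    ("SEQ ID NO: 1", 1668, "ATG", "TAA"),
    ("SEQ ID NO: 2", 585, "GTG", "TGA"),
    ("SEQ ID NO: 3", 426, "ATG", "TAA"),
    ("SEQ ID NO: 4", 1164, "ATG", "TGA"),
]


def looks_like_complete_orf(length: int, first: str, last: str) -> bool:
    """True if the length is a whole number of codons with start and stop."""
    return length % 3 == 0 and first in START_CODONS and last in STOP_CODONS


def encoded_residues(length: int) -> int:
    """Number of amino acids encoded, excluding the terminal stop codon."""
    return length // 3 - 1


if __name__ == "__main__":
    for label, length, first, last in ENTRIES:
        ok = looks_like_complete_orf(length, first, last)
        print(label, ok, encoded_residues(length))
```
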
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1380 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GPD1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CTTTAATTTT CTTTTATCTT ACTCTCCTAC ATAAGACATC AAGAAACAAT TGTATATTGT 60 

ACACCCCCCC CCTCCACAAA CACAAATATT GATAATATAA AGATGTCTGC TGCTGCTGAT 120 

AGATTAAACT TAACTTCCGG CCACTTGAAT GCTGGTAGAA AGAGAAGTTC CTCTTCTGTT 180 

TCTTTGAAGG CTGCCGAAAA GCCTTTCAAG GTTACTGTGA TTGGATCTGG TAACTGGGGT 240 

ACTACTATTG CCAAGGTGGT TGCCGAAAAT TGTAAGGGAT ACCCAGAAGT TTTCGCTCCA 300 

ATAGTACAAA TGTGGGTGTT CGAAGAAGAG ATCAATGGTG AAAAATTGAC TGAAATCATA 360 

AATACTAGAC ATCAAAACGT GAAATACTTG CCTGGCATCA CTCTACCCGA CAATTTGGTT 420 

GCTAATCCAG ACTTGATTGA TTCAGTCAAG GATGTCGACA TCATCGTTTT CAACATTCCA 480 

CATCAATTTT TGCCCCGTAT CTGTAGCCAA TTGAAAGGTC ATGTTGATTC ACACGTCAGA 540 

GCTATCTCCT GTCTAAAGGG TTTTGAAGTT GGTGCTAAAG GTGTCCAATT GCTATCCTCT 600 

TACATCACTG AGGAACTAGG TATTCAATGT GGTGCTCTAT CTGGTGCTAA CATTGCCACC 660 

GAAGTCGCTC AAGAACACTG GTCTGAAACA ACAGTTGCTT ACCACATTCC AAAGGATTTC 720 

AGAGGCGAGG GCAAGGACGT CGACCATAAG GTTCTAAAGG CCTTGTTCCA CAGACCTTAC 780 

TTCCACGTTA GTGTCATCGA AGATGTTGCT GGTATCTCCA TCTGTGGTGC TTTGAAGAAC 840 

GTTGTTGCCT TAGGTTGTGG TTTCGTCGAA GGTCTAGGCT GGGGTAACAA CGCTTCTGCT 900 

GCCATCCAAA GAGTCGGTTT GGGTGAGATC ATCAGATTCG GTCAAATGTT TTTCCCAGAA 960 

TCTAGAGAAG AAACATACTA CCAAGAGTCT GCTGGTGTTG CTGATTTGAT CACCACCTGC 1020 

GCTGGTGGTA GAAACGTCAA GGTTGCTAGG CTAATGGCTA CTTCTGGTAA GGACGCCTGG 1080 

GAATGTGAAA AGGAGTTGTT GAATGGCCAA TCCGCTCAAG GTTTAATTAC CTGCAAAGAA 1140 

GTTCACGAAT GGTTGGAAAC ATGTGGCTCT GTCGAAGACT TCCCATTATT TGAAGCCGTA 1200 

TACCAAATCG TTTACAACAA CTACCCAATG AAGAACCTGC CGGACATGAT TGAAGAATTA 1260 

GATCTACATG AAGATTAGAT TTATTGGAGA AAGATAACAT ATCATACTTC CCCCACTTTT 1320 

TTCGAGGCTC TTCTATATCA TATTCATAAA TTAGCATTAT GTCATTTCTC ATAACTACTT 1380 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2946 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GPD2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GAATTCGAGC CTGAAGTGCT GATTACCTTC AGGTAGACTT CATCTTGACC CATCAACCCC 60 

AGCGTCAATC CTGCAAATAC ACCACCCAGC AGCACTAGGA TGATAGAGAT AATATAGTAC 120 

GTGGTAACGC TTGCCTCATC ACCTACGCTA TGGCCGGAAT CGGCAACATC CCTAGAATTG 180 

AGTACGTGTG ATCCGGATAA CAACGGCAGT GAATATATCT TCGGTATCGT AAAGATGTGA 240 

TATAAGATGA TGTATACCCA ATGAGGAGCG CCTGATCGTG ACCTAGACCT TAGTGGCAAA 300 

AACGACATAT CTATTATAGT GGGGAGAGTT TCGTGCAAAT AACAGACGCA GCAGCAAGTA 360 

ACTGTGACGA TATCAACTCT TTTTTTATTA TGTAATAAGC AAACAAGCAC GAATGGGGAA 420 

AGCCTATGTG CAATCACCAA GGTCGTCCCT TTTTTCCCAT TTGCTAATTT AGAATTTAAA 480 

GAAACCAAAA GAATGAAGAA AGAAAACAAA TACTAGCCCT AACCCTGACT TCGTTTCTAT 540 

GATAATACCC TGCTTTAATG AACGGTATGC CCTAGGGTAT ATCTCACTCT GTACGTTACA 600 

AACTCCGGTT ATTTTATCGG AACATCCGAG CACCCGCGCC TTCCTCAACC CAGGCACCGC 660 

CCCAGGTAAC CGTGCGCGAT GAGCTAATCC TGAGCCATCA CCCACCCCAC CCGTTGATGA 720 

CAGCAATTCG GGAGGGCGAA AATAAAACTG GAGCAAGGAA TTACCATCAC CGTCACCATC 780 

ACCATCATAT CGCCTTAGCC TCTAGCCATA GCCATCATGC AAGCGTGTAT CTTCTAAGAT 840 

TCAGTCATCA TCATTACCGA GTTTGTTTTC CTTCACATGA TGAAGAAGGT TTGAGTATGC 900 

TCGAAACAAT AAGACGACGA TGGCTCTGCC ATTGGTTATA TTACGCTTTT GCGGCGAGGT 960 

GCCGATGGGT TGCTGAGGGG AAGAGTGTTT AGCTTACGGA CCTATTGCCA TTGTTATTCC 1020 

GATTAATCTA TTGTTCAGCA GCTCTTCTCT ACCCTGTCAT TCTAGTATTT TTTTTTTTTT 1080 

TTTTTGGTTT TACTTTTTTT TCTTCTTGCC TTTTTTTCTT GTTACTTTTT TTCTAGTTTT 1140 

TTTTCCTTCC ACTAAGCTTT TTCCTTGATT TATCCTTGGG TTCTTCTTTC TACTCCTTTA 1200 

GATTTTTTTT TTATATATTA ATTTTTAAGT TTATGTATTT TGGTAGATTC AATTCTCTTT 1260 

CCCTTTCCTT TTCCTTCGCT CCCCTTCCTT ATCAATGCTT GCTGTCAGAA GATTAACAAG 1320 

ATACACATTC CTTAAGCGAA CGCATCCGGT GTTATATACT CGTCGTGCAT ATAAAATTTT 1380 

GCCTTCAAGA TCTACTTTCC TAAGAAGATC ATTATTACAA ACACAACTGC ACTCAAAGAT 1440 

GACTGCTCAT ACTAATATCA AACAGCACAA ACACTGTCAT GAGGACCATC CTATCAGAAG 1500 

ATCGGACTCT GCCGTGTCAA TTGTACATTT GAAACGTGCG CCCTTCAAGG TTACAGTGAT 1560 

TGGTTCTGGT AACTGGGGGA CCACCATCGC CAAAGTCATT GCGGAAAACA CAGAATTGCA 1620 

TTCCCATATC TTCGAGCCAG AGGTGAGAAT GTGGGTTTTT GATGAAAAGA TCGGCGACGA 1680 

AAATCTGACG GATATCATAA ATACAAGACA CCAGAACGTT AAATATCTAC CCAATATTGA 1740 

CCTGCCCCAT AATCTAGTGG CCGATCCTGA TCTTTTACAC TCCATCAAGG GTGCTGACAT 1800 

CCTTGTTTTC AACATCCCTC ATCAATTTTT ACCAAACATA GTCAAACAAT TGCAAGGCCA 1860 

CGTGGCCCCT CATGTAAGGG CCATCTCGTG TCTAAAAGGG TTCGAGTTGG GCTCCAAGGG 1920 

TGTGCAATTG CTATCCTCCT ATGTTACTGA TGAGTTAGGA ATCCAATGTG GCGCACTATC 1980 

TGGTGCAAAC TTGGCACCGG AAGTGGCCAA GGAGCATTGG TCCGAAACCA CCGTGGCTTA 2040 

CCAACTACCA AAGGATTATC AAGGTGATGG CAAGGATGTA GATCATAAGA TTTTGAAATT 2100 

GCTGTTCCAC AGACCTTACT TCCACGTCAA TGTCATCGAT GATGTTGCTG GTATATCCAT 2160 

TGCCGGTGCC TTGAAGAACG TCGTGGCACT TGCATGTGGT TTCGTAGAAG GTATGGGATG 2220 

GGGTAACAAT GCCTCCGCAG CCATTCAAAG GCTGGGTTTA GGTGAAATTA TCAAGTTCGG 2280 

TAGAATGTTT TTCCCAGAAT CCAAAGTCGA GACCTACTAT CAAGAATCCG CTGGTGTTGC 2340 

AGATCTGATC ACCACCTGCT CAGGCGGTAG AAACGTCAAG GTTGCCACAT ACATGGCCAA 2400 

GACCGGTAAG TCAGCCTTGG AAGCAGAAAA GGAATTGCTT AACGGTCAAT CCGCCCAAGG 2460 

GATAATCACA TGCAGAGAAG TTCACGAGTG GCTACAAACA TGTGAGTTGA CCCAAGAATT 2520 

CCCAATTATT CGAGGCAGTC TACCAGATAG TCTACAACAA CGTCCGCATG GAAGACCTAC 2580 

CGGAGATGAT TGAAGAGCTA GACATCGATG ACGAATAGAC ACTCTCCCCC CCCCTCCCCC 2640 

TCTGATCTTT CCTGTTGCCT CTTTTTCCCC CAACCAATTT ATCATTATAC ACAAGTTCTA 2700 

CAACTACTAC TAGTAACATT ACTACAGTTA TTATAATTTT CTATTCTCTT TTTCTTTAAG 2760 

AATCTATCAT TAACGTTAAT TTCTATATAT ACATAACTAC CATTATACAC GCTATTATCG 2820 

TTTACATATC ACATCACCGT TAATGAAAGA TACGACACCC TGTACACTAA CACAATTAAA 2880 

TAATCGCCAT AACCTTTTCT GTTATCTATA GCCCTTAAAG CTGTTTCTTC GAGCTTTTCA 2940 

CTGCAG 2946 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3178 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GUT2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

CTGCAGAACT TCGTCTGCTC TGTGCCCATC CTCGCGGTTA GAAAGAAGCT GAATTGTTTC 60 

ATGCGCAAGG GCATCAGCGA GTGACCAATA ATCACTGCAC TAATTCCTTT TTAGCAACAC 120 

ATACTTATAT ACAGCACCAG ACCTTATGTC TTTTCTCTGC TCCGATACGT TATCCCACCC 180 

AACTTTTATT TCAGTTTTGG CAGGGGAAAT TTCACAACCC CGCACGCTAA AAATCGTATT 240 

TAAACTTAAA AGAGAACAGC CACAAATAGG GAACTTTGGT CTAAACGAAG GACTCTCCCT 300 

CCCTTATCTT GACCGTGCTA TTGCCATCAC TGCTACAAGA CTAAATACGT ACTAATATAT 360 

GTTTTCGGTA ACGAGAAGAA GAGCTGCCGG TGCAGCTGCT GCCATGGCCA CAGCCACGGG 420 

GACGCTGTAC TGGATGACTA GCCAAGGTGA TAGGCCGTTA GTGCACAATG ACCCGAGCTA 480 

CATGGTGCAA TTCCCCACCG CCGCTCCACC GGCAGGTCTC TAGACGAGAC CTGCTGGACC 540 

GTCTGGACAA GACGCATCAA TTCGACGTGT TGATCATCGG TGGCGGGGCC ACGGGGACAG 600 

GATGTGCCCT AGATGCTGCG ACCAGGGGAC TCAATGTGGC CCTTGTTGAA AAGGGGGATT 660 

TTGCCTCGGG AACGTCGTCC AAATCTACCA AGATGATTCA CGGTGGGGTG CGGTACTTAG 720 

AGAAGGCCTT CTGGGAGTTC TCCAAGGCAC AACTGGATCT GGTCATCGAG GCACTCAACG 780 

AGCGTAAACA TCTTATCAAC ACTGCCCCTC ACCTGTGCAC GGTGCTACCA ATTCTGATCC 840 

CCATCTACAG CACCTGGCAG GTCCCGTACA TCTATATGGG CTGTAAATTC TACGATTTCT 900 

TTGGCGGTTC CCAAAACTTG AAAAAATCAT ACCTACTGTC CAAATCCGCC ACCGTGGAGA 960 

AGGCTCCCAT GCTTACCACA GACAATTTAA AGGCCTCGCT TGTGTACCAT GATGGGTCCT 1020 

TTAACGACTC GCGTTTGAAC GCCACTTTAG CCATCACGGG TGTGGAGAAC GGCGCTACCG 1080 

TCTTGATCTA TGTCGAGGTA CAAAAATTGA TCAAAGACCC AACTTCTGGT AAGGTTATCG 1140 

GTGCCGAGGC CCGGGACGTT GAGACTAATG AGCTTGTCAG AATCAACGCT AAATGTGTGG 1200 

TCAATGCCAC GGGCCCATAC AGTGACGCCA TTTTGCAAAT GGACCGCAAC CCATCCGGTC 1260 

TGCCGGACTC CCCGCTAAAC GACAACTCCA AGATCAAGTC GACTTTCAAT CAAATCTCCG 1320 

TCATGGACCC GAAAATGGTC ATCCCATCTA TTGGCGTTCA CATCGTATTG CCCTCTTTTT 1380 

ACTCCCCGAA GGATATGGGT TTGTTGGACG TCAGAACCTC TGATGGCAGA GTGATGTTCT 1440 

TTTTACCTTG GCAGGGCAAA GTCCTTGCCG GCACCACAGA CATCCCACTA AAGCAAGTCC 1500 

CAGAAAACCC TATGCCTACA GAGGCTGATA TTCAAGATAT CTTGAAAGAA CTACAGCACT 1560 

ATATCGAATT CCCCGTGAAA AGAGAAGACG TGCTAAGTGC ATGGGCTGGT GTCAGACCTT 1620 

TGGTCAGAGA TCCACGTACA ATCCCCGCAG ACGGGAAGAA GGGCTCTGCC ACTCAGGGCG 1680 

TGGTAAGATC CCACTTCTTG TTCACTTCGG ATAATGGCCT AATTACTATT GCAGGTGGTA 1740 

AATGGACTAC TTACAGACAA ATGGCTGAGG AAACAGTCGA CAAAGTTGTC GAAGTTGGCG 1800 

GATTCCACAA CCTGAAACCT TGTCACACAA GAGATATTAA GCTTGCTGGT GCAGAAGAAT 1860 

GGACGCAAAA CTATGTGGCT TTATTGGCTC AAAACTACCA TTTATCATCA AAAATGTCCA 1920 

ACTACTTGGT TCAAAACTAC GGAACCCGTT CCTCTATCAT TTGCGAATTT TTCAAAGAAT 1980 

CCATGGAAAA TAAACTGCCT TTGTCCTTAG CCGACAAGGA AAATAACGTA ATCTACTCTA 2040 

GCGAGGAGAA CAACTTGGTC AATTTTGATA CTTTCAGATA TCCATTCACA ATCGGTGAGT 2100 

TAAAGTATTC CATGCAGTAC GAATATTGTA GAACTCCCTT GGACTTCCTT TTAAGAAGAA 2160 

CAAGATTCGC CTTCTTGGAC GCCAAGGAAG CTTTGAATGC CGTGCATGCC ACCGTCAAAG 2220 

TTATGGGTGA TGAGTTCAAT TGGTCGGAGA AAAAGAGGCA GTGGGAACTT GAAAAAACTG 2280 

TGAACTTCAT CCAAGGACGT TTCGGTGTCT AAATCGATCA TGATAGTTAA GGGTGACAAA 2340 

GATAACATTC ACAAGAGTAA TAATAATGGT AATGATGATA ATAATAATAA TGATAGTAAT 2400 

AACAATAATA ATAATGGTGG TAATGGCAAT GAAATCGCTA TTATTACCTA TTTTCCTTAA 2460 

TGGAAGAGTT AAAGTAAACT AAAAAAACTA CAAAAATATA TGAAGAAAAA AAAAAAAAGA 2520 

GGTAATAGAC TCTACTACTA CAATTGATCT TCAAATTATG ACCTTCCTAG TGTTTATATT 2580 

CTATTTCCAA TACATAATAT AATCTATATA ATCATTGCTG GTAGACTTCC GTTTTAATAT 2640 

CGTTTTAATT ATCCCCTTTA TCTCTAGTCT AGTTTTATCA TAAAATATAG AAACACTAAA 2700 

TAATATTCTT CAAACGGTCC TGGTGCATAC GCAATACATA TTTATGGTGC AAAAAAAAAA 2760 

ATGGAAAATT TTGCTAGTCA TAAACCCTTT CATAAAACAA TACGTAGACA TCGCTACTTG 2820 

AAATTTTCAA GTTTTTATCA GATCCATGTT TCCTATCTGC CTTGACAACC TCATCGTCGA 2880 

AATAGTACCA TTTAGAACGC CCAATATTCA CATTGTGTTC AAGGTCTTTA TTCACCAGTG 2940 

ACGTGTAATG GCCATGATTA ATGTGCCTGT ATGGTTAACC ACTCCAAATA GCTTATATTT 3000 

CATAGTGTCA TTGTTTTTCA ATATAATGTT TAGTATCAAT GGATATGTTA CGACGGTGTT 3060 

ATTTTTCTTG GTCAAATCGT AATAAAATCT CGATAAATGG ATGACTAAGA TTTTTGGTAA 3120 

AGTTACAAAA TTTATCGTTT TCACTGTTGT CAATTTTTTG TTCTTGTAAT CACTCGAG 3178 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 816 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GPP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

ATGAAACGTT TCAATGTTTT AAAATATATC AGAACAACAA AAGCAAATAT ACAAACCATC 60 

GCAATGCCTT TGACCACAAA ACCTTTATCT TTGAAAATCA ACGCCGCTCT ATTCGATGTT 120 

GACGGTACCA TCATCATCTC TCAACCAGCC ATTGCTGCTT TCTGGAGAGA TTTCGGTAAA 180 

GACAAGCCTT ACTTCGATGC CGAACACGTT ATTCACATCT CTCACGGTTG GAGAACTTAC 240 

GATGCCATTG CCAAGTTCGC TCCAGACTTT GCTGATGAAG AATACGTTAA CAAGCTAGAA 300 

GGTGAAATCC CAGAAAAGTA CGGTGAACAC TCCATCGAAG TTCCAGGTGC TGTCAAGTTG 360 

TGTAATGCTT TGAACGCCTT GCCAAAGGAA AAATGGGCTG TCGCCACCTC TGGTACCCGT 420 

GACATGGCCA AGAAATGGTT CGACATTTTG AAGATCAAGA GACCAGAATA CTTCATCACC 480 

GCCAATGATG TCAAGCAAGG TAAGCCTCAC CCAGAACCAT ACTTAAAGGG TAGAAACGGT 540 

TTGGGTTTCC CAATTAATGA ACAAGACCCA TCCAAATCTA AGGTTGTTGT CTTTGAAGAC 600 

GCACCAGCTG GTATTGCTGC TGGTAAGGCT GCTGGCTGTA AAATCGTTGG TATTGCTACC 660 

ACTTTCGATT TGGACTTCTT GAAGGAAAAG GGTTGTGACA TCATTGTCAA GAACCACGAA 720 

TCTATCAGAG TCGGTGAATA CAACGCTGAA ACCGATGAAG TCGAATTGAT CTTTGATGAC 780 

TACTTATACG CTAAGGATGA CTTGTTGAAA TGGTAA 816 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 753 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GPP2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATGGGATTGA CTACTAAACC TCTATCTTTG AAAGTTAACG CCGCTTTGTT CGACGTCGAC 60 

GGTACCATTA TCATCTCTCA ACCAGCCATT GCTGCATTCT GGAGGGATTT CGGTAAGGAC 120 

AAACCTTATT TCGATGCTGA ACACGTTATC CAAGTCTCGC ATGGTTGGAG AACGTTTGAT 180 

GCCATTGCTA AGTTCGCTCC AGACTTTGCC AATGAAGAGT ATGTTAACAA ATTAGAAGCT 240 

GAAATTCCGG TCAAGTACGG TGAAAAATCC ATTGAAGTCC CAGGTGCAGT TAAGCTGTGC 300 

AACGCTTTGA ACGCTCTACC AAAAGAGAAA TGGGCTGTGG CAACTTCCGG TACCCGTGAT 360 

ATGGCACAAA AATGGTTCGA GCATCTGGGA ATCAGGAGAC CAAAGTACTT CATTACCGCT 420 

AATGATGTCA AACAGGGTAA GCCTCATCCA GAACCATATC TGAAGGGCAG GAATGGCTTA 480 

GGATATCCGA TCAATGAGCA AGACCCTTCC AAATCTAAGG TAGTAGTATT TGAAGACGCT 540 

CCAGCAGGTA TTGCCGCCGG AAAAGCCGCC GGTTGTAAGA TCATTGGTAT TGCCACTACT 600 

TTCGACTTGG ACTTCCTAAA GGAAAAAGGC TGTGACATCA TTGTCAAAAA CCACGAATCC 660 

ATCAGAGTTG GCGGCTACAA TGCCGAAACA GACGAAGTTG AATTCATTTT TGACGACTAC 720 

TTATATGCTA AGGACGATCT GTTGAAATGG TAA 753 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2520 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GUT1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

TGTATTGGCC ACGATAACCA CCCTTTGTAT ACTGTTTTTG TTTTTCACAT GGTAAATAAC 60 

GACTTTTATT AAACAACGTA TGTAAAAACA TAACAAGAAT CTACCCATAC AGGCCATTTC 120 

GTAATTCTTC TCTTCTAATT GGAGTAAAAC CATCAATTAA AGGGTGTGGA GTAGCATAGT 180 

GAGGGGCTGA CTGCATTGAC AAAAAAATTG AAAAAAAAAA AGGAAAAGGA AAGGAAAAAA 240 

AGACAGCCAA GACTTTTAGA ACGGATAAGG TGTAATAAAA TGTGGGGGGA TGCCTGTTCT 300 

CGAACCATAT AAAATATACC ATGTGGTTTG AGTTGTGGCC GGAACTATAC AAATAGTTAT 360 

ATGTTTCCCT CTCTCTTCCG ACTTGTAGTA TTCTCCAAAC GTTACATATT CCGATCAAGC 420 

CAGCGCCTTT ACACTAGTTT AAAACAAGAA CAGAGCCGTA TGTCCAAAAT AATGGAAGAT 480 

TTACGAAGTG ACTACGTCCC GCTTATCGCC AGTATTGATG TAGGAACGAC CTCATCCAGA 540 

TGCATTCTGT TCAACAGATG GGGCCAGGAC GTTTCAAAAC ACCAAATTGA ATATTCAACT 600 

TCAGCATCGA AGGGCAAGAT TGGGGTGTCT GGCCTAAGGA GACCCTCTAC AGCCCCAGCT 660 

CGTGAAACAC CAAACGCCGG TGACATCAAA ACCAGCGGAA AGCCCATCTT TTCTGCAGAA 720 

GGCTATGCCA TTCAAGAAAC CAAATTCCTA AAAATCGAGG AATTGGACTT GGACTTCCAT 780 

AACGAACCCA CGTTGAAGTT CCCCAAACCG GGTTGGGTTG AGTGCCATCC GCAGAAATTA 840 

CTGGTGAACG TCGTCCAATG CCTTGCCTCA AGTTTGCTCT CTCTGCAGAC TATCAACAGC 900 

GAACGTGTAG CAAACGGTCT CCCACCTTAC AAGGTAATAT GCATGGGTAT AGCAAACATG 960 

AGAGAAACCA CAATTCTGTG GTCCCGCCGC ACAGGAAAAC CAATTGTTAA CTACGGTATT 1020 

GTTTGGAACG ACACCAGAAC GATCAAAATC GTTAGAGACA AATGGCAAAA CACTAGCGTC 1080 

GATAGGCAAC TGCAGCTTAG ACAGAAGACT GGATTGCCAT TGCTCTCCAC GTATTTCTCC 1140 

TGTTCCAAGC TGCGCTGGTT CCTCGACAAT GAGCCTCTGT GTACCAAGGC GTATGAGGAG 1200 

AACGACCTGA TGTTCGGCAC TGTGGACACA TGGCTGATTT ACCAATTAAC TAAACAAAAG 1260 

GCGTTCGTTT CTGACGTAAC CAACGCTTCC AGAACTGGAT TTATGAACCT CTCCACTTTA 1320 

AAGTACGACA ACGAGTTGCT GGAATTTTGG GGTATTGACA AGAACCTGAT TCACATGCCC 1380 

GAAATTGTGT CCTCATCTCA ATACTACGGT GACTTTGGCA TTCCTGATTG GATAATGGAA 1440 

AAGCTACACG ATTCGCCAAA AACAGTACTG CGAGATCTAG TCAAGAGAAA CCTGCCCATA 1500 

CAGGGCTGTC TGGGCGACCA AAGCGCATCC ATGGTGGGGC AACTCGCTTA CAAACCCGGT 1560 

GCTGCAAAAT GTACTTATGG TACCGGTTGC TTTTTACTGT ACAATACGGG GACCAAAAAA 1620 

TTGATCTCCC AACATGGCGC ACTGACGACT CTAGCATTTT GGTTCCCACA TTTGCAAGAG 1680 

TACGGTGGCC AAAAACCAGA ATTGAGCAAG CCACATTTTG CATTAGAGGG TTCCGTCGCT 1740 

GTGGCTGGTG CTGTGGTCCA ATGGCTACGT GATAATTTAC GATTGATCGA TAAATCAGAG 1800 

GATGTCGGAC CGATTGCATC TACGGTTCCT GATTCTGGTG GCGTAGTTTT CGTCCCCGCA 1860 

TTTAGTGGCC TATTCGCTCC CTATTGGGAC CCAGATGCCA GAGCCACCAT AATGGGGATG 1920 

TCTCAATTCA CTACTGCCTC CCACATCGCC AGAGCTGCCG TGGAAGGTGT TTGCTTTCAA 1980 

GCCAGGGCTA TCTTGAAGGC AATGAGTTCT GACGCGTTTG GTGAAGGTTC CAAAGACAGG 2040 

GACTTTTTAG AGGAAATTTC CGACGTCACA TATGAAAAGT CGCCCCTGTC GGTTCTGGCA 2100 

GTGGATGGCG GGATGTCGAG GTCTAATGAA GTCATGCAAA TTCAAGCCGA TATCCTAGGT 2160 

CCCTGTGTCA AAGTCAGAAG GTCTCCGACA GCGGAATGTA CCGCATTGGG GGCAGCCATT 2220 

GCAGCCAATA TGGCTTTCAA GGATGTGAAC GAGCGCCCAT TATGGAAGGA CCTACACGAT 2280 

GTTAAGAAAT GGGTCTTTTA CAATGGAATG GAGAAAAACG AACAAATATC ACCAGAGGCT 2340 

CATCCAAACC TTAAGATATT CAGAAGTGAA TCCGACGATG CTGAAAGGAG AAAGCATTGG 2400 

AAGTATTGGG AAGTTGCCGT GGAAAGATCC AAAGGTTGGC TGAAGGACAT AGAAGGTGAA 2460 

CACGAACAGG TTCTAGAAAA CTTCCAATAA CAACATAAAT AATTTCTATT AACAATGTAA 2520 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 391 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GPD1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Ser Ala Ala Ala Asp Arg Leu Asn Leu Thr Ser Gly His Leu Asn 
1 5 10 15 

Ala Gly Arg Lys Arg Ser Ser Ser Ser Val Ser Leu Lys Ala Ala Glu 
20 25 30 

Lys Pro Phe Lys Val Thr Val Ile Gly Ser Gly Asn Trp Gly Thr Thr 
35 40 45 

Ile Ala Lys Val Val Ala Glu Asn Cys Lys Gly Tyr Pro Glu Val Phe 
50 55 60 

Ala Pro Ile Val Gln Met Trp Val Phe Glu Glu Glu Ile Asn Gly Glu 
65 70 75 80 

Lys Leu Thr Glu Ile Ile Asn Thr Arg His Gln Asn Val Lys Tyr Leu 
85 90 95 

Pro Gly Ile Thr Leu Pro Asp Asn Leu Val Ala Asn Pro Asp Leu Ile 
100 105 110 

Asp Ser Val Lys Asp Val Asp Ile Ile Val Phe Asn Ile Pro His Gln 
115 120 125 

Phe Leu Pro Arg Ile Cys Ser Gln Leu Lys Gly His Val Asp Ser His 
130 135 140 

Val Arg Ala Ile Ser Cys Leu Lys Gly Phe Glu Val Gly Ala Lys Gly 
145 150 155 160 

Val Gln Leu Leu Ser Ser Tyr Ile Thr Glu Glu Leu Gly Ile Gln Cys 
165 170 175 

Gly Ala Leu Ser Gly Ala Asn Ile Ala Thr Glu Val Ala Gln Glu His 
180 185 190 

Trp Ser Glu Thr Thr Val Ala Tyr His Ile Pro Lys Asp Phe Arg Gly 
195 200 205 

Glu Gly Lys Asp Val Asp His Lys Val Leu Lys Ala Leu Phe His Arg 
210 215 220 

Pro Tyr Phe His Val Ser Val Ile Glu Asp Val Ala Gly Ile Ser Ile 
225 230 235 240 

Cys Gly Ala Leu Lys Asn Val Val Ala Leu Gly Cys Gly Phe Val Glu 
245 250 255 

Gly Leu Gly Trp Gly Asn Asn Ala Ser Ala Ala Ile Gln Arg Val Gly 
260 265 270 

Leu Gly Glu Ile Ile Arg Phe Gly Gln Met Phe Phe Pro Glu Ser Arg 
275 280 285 

Glu Glu Thr Tyr Tyr Gln Glu Ser Ala Gly Val Ala Asp Leu Ile Thr 
290 295 300 

Thr Cys Ala Gly Gly Arg Asn Val Lys Val Ala Arg Leu Met Ala Thr 
305 310 315 320 

Ser Gly Lys Asp Ala Trp Glu Cys Glu Lys Glu Leu Leu Asn Gly Gln 
325 330 335 

Ser Ala Gln Gly Leu Ile Thr Cys Lys Glu Val His Glu Trp Leu Glu 
340 345 350 

Thr Cys Gly Ser Val Glu Asp Phe Pro Leu Phe Glu Ala Val Tyr Gln 
355 360 365 

Ile Val Tyr Asn Asn Tyr Pro Met Lys Asn Leu Pro Asp Met Ile Glu 
370 375 380 

Glu Leu Asp Leu His Glu Asp 
385 390 
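
The GPD1 protein of SEQ ID NO: 11 corresponds to an open reading frame within the GPD1 nucleotide sequence of SEQ ID NO: 5. The following is a minimal sketch, not taken from the application, that translates the first 48 nucleotides of that reading frame and reproduces the first row of SEQ ID NO: 11. The start position (nucleotide 103 of SEQ ID NO: 5) is inferred here by matching the codons to the protein sequence and is an assumption of the sketch, as are the use of the standard genetic code and the function name `translate`.

```python
# Standard genetic code, codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter amino acid codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Nucleotides 103-150 of SEQ ID NO: 5 (assumed start of the GPD1 frame).
GPD1_ORF_START = "ATGTCTGCTGCTGCTGATAGATTAAACTTAACTTCCGGCCACTTGAAT"


def translate(dna: str) -> str:
    """Translate DNA codon by codon into three-letter residues, stopping at a stop codon."""
    residues = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[pos:pos + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return " ".join(residues)


if __name__ == "__main__":
    print(translate(GPD1_ORF_START))
```

Under these assumptions the sketch yields the residues Met Ser Ala Ala Ala Asp Arg Leu Asn Leu Thr Ser Gly His Leu Asn, which is residues 1 to 16 of SEQ ID NO: 11.
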

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 384 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GPD2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Thr Ala His Thr Asn Ile Lys Gln His Lys His Cys His Glu Asp 
1 5 10 15 

His Pro Ile Arg Arg Ser Asp Ser Ala Val Ser Ile Val His Leu Lys 
20 25 30 

Arg Ala Pro Phe Lys Val Thr Val Ile Gly Ser Gly Asn Trp Gly Thr 
35 40 45 

Thr Ile Ala Lys Val Ile Ala Glu Asn Thr Glu Leu His Ser His Ile 
50 55 60 

Phe Glu Pro Glu Val Arg Met Trp Val Phe Asp Glu Lys Ile Gly Asp 
65 70 75 80 

Glu Asn Leu Thr Asp Ile Ile Asn Thr Arg His Gln Asn Val Lys Tyr 
85 90 95 

Leu Pro Asn Ile Asp Leu Pro His Asn Leu Val Ala Asp Pro Asp Leu 
100 105 110 

Leu His Ser Ile Lys Gly Ala Asp Ile Leu Val Phe Asn Ile Pro His 
115 120 125 

Gln Phe Leu Pro Asn Ile Val Lys Gln Leu Gln Gly His Val Ala Pro 
130 135 140 

His Val Arg Ala Ile Ser Cys Leu Lys Gly Phe Glu Leu Gly Ser Lys 
145 150 155 160 

Gly Val Gln Leu Leu Ser Ser Tyr Val Thr Asp Glu Leu Gly Ile Gln 
165 170 175 

Cys Gly Ala Leu Ser Gly Ala Asn Leu Ala Pro Glu Val Ala Lys Glu 
180 185 190 

His Trp Ser Glu Thr Thr Val Ala Tyr Gln Leu Pro Lys Asp Tyr Gln 
195 200 205 

Gly Asp Gly Lys Asp Val Asp His Lys Ile Leu Lys Leu Leu Phe His 
210 215 220 

Arg Pro Tyr Phe His Val Asn Val Ile Asp Asp Val Ala Gly Ile Ser 
225 230 235 240 

Ile Ala Gly Ala Leu Lys Asn Val Val Ala Leu Ala Cys Gly Phe Val 
245 250 255 

Glu Gly Met Gly Trp Gly Asn Asn Ala Ser Ala Ala Ile Gln Arg Leu 
260 265 270 

Gly Leu Gly Glu Ile Ile Lys Phe Gly Arg Met Phe Phe Pro Glu Ser 
275 280 285 

Lys Val Glu Thr Tyr Tyr Gln Glu Ser Ala Gly Val Ala Asp Leu Ile 
290 295 300 

Thr Thr Cys Ser Gly Gly Arg Asn Val Lys Val Ala Thr Tyr Met Ala 
305 310 315 320 

Lys Thr Gly Lys Ser Ala Leu Glu Ala Glu Lys Glu Leu Leu Asn Gly 
325 330 335 

Gln Ser Ala Gln Gly Ile Ile Thr Cys Arg Glu Val His Glu Trp Leu 
340 345 350 

Gln Thr Cys Glu Leu Thr Gln Glu Phe Pro Ile Ile Arg Gly Ser Leu 
355 360 365 

Pro Asp Ser Leu Gln Gln Arg Pro His Gly Arg Pro Thr Gly Asp Asp 
370 375 380 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 614 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GUT2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Thr Arg Ala Thr Trp Cys Asn Ser Pro Pro Pro Leu His Arg Gln 
1 5 10 15 

Val Ser Arg Arg Asp Leu Leu Asp Arg Leu Asp Lys Thr His Gln Phe 
20 25 30 

Asp Val Leu Ile Ile Gly Gly Gly Ala Thr Gly Thr Gly Cys Ala Leu 
35 40 45 

Asp Ala Ala Thr Arg Gly Leu Asn Val Ala Leu Val Glu Lys Gly Asp 
50 55 60 

Phe Ala Ser Gly Thr Ser Ser Lys Ser Thr Lys Met Ile His Gly Gly 
65 70 75 80 

Val Arg Tyr Leu Glu Lys Ala Phe Trp Glu Phe Ser Lys Ala Gln Leu 
85 90 95 

Asp Leu Val Ile Glu Ala Leu Asn Glu Arg Lys His Leu Ile Asn Thr 
100 105 110 

Ala Pro His Leu Cys Thr Val Leu Pro Ile Leu Ile Pro Ile Tyr Ser 
115 120 125 

Thr Trp Gln Val Pro Tyr Ile Tyr Met Gly Cys Lys Phe Tyr Asp Phe 
130 135 140 

Phe Gly Gly Ser Gln Asn Leu Lys Lys Ser Tyr Leu Leu Ser Lys Ser 
145 150 155 160 

Ala Thr Val Glu Lys Ala Pro Met Leu Thr Thr Asp Asn Leu Lys Ala 
165 170 175 

Ser Leu Val Tyr His Asp Gly Ser Phe Asn Asp Ser Arg Leu Asn Ala 
180 185 190 

Thr Leu Ala Ile Thr Gly Val Glu Asn Gly Ala Thr Val Leu Ile Tyr 
195 200 205 

Val Glu Val Gln Lys Leu Ile Lys Asp Pro Thr Ser Gly Lys Val Ile 
210 215 220 

Gly Ala Glu Ala Arg Asp Val Glu Thr Asn Glu Leu Val Arg Ile Asn 
225 230 235 240 

Ala Lys Cys Val Val Asn Ala Thr Gly Pro Tyr Ser Asp Ala Ile Leu 
245 250 255 

Gln Met Asp Arg Asn Pro Ser Gly Leu Pro Asp Ser Pro Leu Asn Asp 
260 265 270 

Asn Ser Lys Ile Lys Ser Thr Phe Asn Gln Ile Ser Val Met Asp Pro 
275 280 285 

Lys Met Val Ile Pro Ser Ile Gly Val His Ile Val Leu Pro Ser Phe 
290 295 300 

Tyr Ser Pro Lys Asp Met Gly Leu Leu Asp Val Arg Thr Ser Asp Gly 
305 310 315 320 

Arg Val Met Phe Phe Leu Pro Trp Gln Gly Lys Val Leu Ala Gly Thr 
325 330 335 

Thr Asp Ile Pro Leu Lys Gln Val Pro Glu Asn Pro Met Pro Thr Glu 
340 345 350 

Ala Asp Ile Gln Asp Ile Leu Lys Glu Leu Gln His Tyr Ile Glu Phe 
355 360 365 

Pro Val Lys Arg Glu Asp Val Leu Ser Ala Trp Ala Gly Val Arg Pro 
370 375 380 

Leu Val Arg Asp Pro Arg Thr Ile Pro Ala Asp Gly Lys Lys Gly Ser 
385 390 395 400 

Ala Thr Gln Gly Val Val Arg Ser His Phe Leu Phe Thr Ser Asp Asn 
405 410 415 

Gly Leu Ile Thr Ile Ala Gly Gly Lys Trp Thr Thr Tyr Arg Gln Met 
420 425 430 

Ala Glu Glu Thr Val Asp Lys Val Val Glu Val Gly Gly Phe His Asn 
435 440 445 

Leu Lys Pro Cys His Thr Arg Asp Ile Lys Leu Ala Gly Ala Glu Glu 
450 455 460 

Trp Thr Gln Asn Tyr Val Ala Leu Leu Ala Gln Asn Tyr His Leu Ser 
465 470 475 480 

Ser Lys Met Ser Asn Tyr Leu Val Gln Asn Tyr Gly Thr Arg Ser Ser 
485 490 495 

Ile Ile Cys Glu Phe Phe Lys Glu Ser Met Glu Asn Lys Leu Pro Leu 
500 505 510 

Ser Leu Ala Asp Lys Glu Asn Asn Val Ile Tyr Ser Ser Glu Glu Asn 
515 520 525 

Asn Leu Val Asn Phe Asp Thr Phe Arg Tyr Pro Phe Thr Ile Gly Glu 
530 535 540 

Leu Lys Tyr Ser Met Gln Tyr Glu Tyr Cys Arg Thr Pro Leu Asp Phe 
545 550 555 560 

Leu Leu Arg Arg Thr Arg Phe Ala Phe Leu Asp Ala Lys Glu Ala Leu 
565 570 575 

Asn Ala Val His Ala Thr Val Lys Val Met Gly Asp Glu Phe Asn Trp 
580 585 590 

Ser Glu Lys Lys Arg Gln Trp Glu Leu Glu Lys Thr Val Asn Phe Ile 
595 600 605 

Gin Gly Arg Phe Gly Val 
610 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 339 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GPSA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Asn Gln Arg Asn Ala Ser Met Thr Val Ile Gly Ala Gly Ser Tyr 
1 5 10 15 

Gly Thr Ala Leu Ala Ile Thr Leu Ala Arg Asn Gly His Glu Val Val 
20 25 30 

Leu Trp Gly His Asp Pro Glu His Ile Ala Thr Leu Glu Arg Asp Arg 
35 40 45 

Cys Asn Ala Ala Phe Leu Pro Asp Val Pro Phe Pro Asp Thr Leu His 
50 55 60 

Leu Glu Ser Asp Leu Ala Thr Ala Leu Ala Ala Ser Arg Asn Ile Leu 
65 70 75 80 

Val Val Val Pro Ser His Val Phe Gly Glu Val Leu Arg Gln Ile Lys 
85 90 95 

Pro Leu Met Arg Pro Asp Ala Arg Leu Val Trp Ala Thr Lys Gly Leu 
100 105 110 

Glu Ala Glu Thr Gly Arg Leu Leu Gln Asp Val Ala Arg Glu Ala Leu 
115 120 125 

Gly Asp Gln Ile Pro Leu Ala Val Ile Ser Gly Pro Thr Phe Ala Lys 
130 135 140 

Glu Leu Ala Ala Gly Leu Pro Thr Ala Ile Ser Leu Ala Ser Thr Asp 
145 150 155 160 

Gln Thr Phe Ala Asp Asp Leu Gln Gln Leu Leu His Cys Gly Lys Ser 
165 170 175 

Phe Arg Val Tyr Ser Asn Pro Asp Phe Ile Gly Val Gln Leu Gly Gly 
180 185 190 

Ala Val Lys Asn Val Ile Ala Ile Gly Ala Gly Met Ser Asp Gly Ile 
195 200 205 

Gly Phe Gly Ala Asn Ala Arg Thr Ala Leu Ile Thr Arg Gly Leu Ala 
210 215 220 

Glu Met Ser Arg Leu Gly Ala Ala Leu Gly Ala Asp Pro Ala Thr Phe 
225 230 235 240 

Met Gly Met Ala Gly Leu Gly Asp Leu Val Leu Thr Cys Thr Asp Asn 
245 250 255 

Gln Ser Arg Asn Arg Arg Phe Gly Met Met Leu Gly Gln Gly Met Asp 
260 265 270 

Val Gln Ser Ala Gln Glu Lys Ile Gly Gln Val Val Glu Gly Tyr Arg 
275 280 285 

Asn Thr Lys Glu Val Arg Glu Leu Ala His Arg Phe Gly Val Glu Met 
290 295 300 

Pro Ile Thr Glu Glu Ile Tyr Gln Val Leu Tyr Cys Gly Lys Asn Ala 
305 310 315 320 

Arg Glu Ala Ala Leu Thr Leu Leu Gly Arg Ala Arg Lys Asp Glu Arg 
325 330 335 

Ser Ser His 



(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 501 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GLPD 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Glu Thr Lys Asp Leu Ile Val Ile Gly Gly Gly Ile Asn Gly Ala 
1 5 10 15 

Gly Ile Ala Ala Asp Ala Ala Gly Arg Gly Leu Ser Val Leu Met Leu 
20 25 30 

Glu Ala Gln Asp Leu Ala Cys Ala Thr Ser Ser Ala Ser Ser Lys Leu 
35 40 45 

Ile His Gly Gly Leu Arg Tyr Leu Glu His Tyr Glu Phe Arg Leu Val 
50 55 60 

Ser Glu Ala Leu Ala Glu Arg Glu Val Leu Leu Lys Met Ala Pro His 
65 70 75 80 

Ile Ala Phe Pro Met Arg Phe Arg Leu Pro His Arg Pro His Leu Arg 
85 90 95 

Pro Ala Trp Met Ile Arg Ile Gly Leu Phe Met Tyr Asp His Leu Gly 
100 105 110 

Lys Arg Thr Ser Leu Pro Gly Ser Thr Gly Leu Arg Phe Gly Ala Asn 
115 120 125 

Ser Val Leu Lys Pro Glu Ile Lys Arg Gly Phe Glu Tyr Ser Asp Cys 
130 135 140 

Trp Val Asp Asp Ala Arg Leu Val Leu Ala Asn Ala Gln Met Val Val 
145 150 155 160 

Arg Lys Gly Gly Glu Val Leu Thr Arg Thr Arg Ala Thr Ser Ala Arg 
165 170 175 

Arg Glu Asn Gly Leu Trp Ile Val Glu Ala Glu Asp Ile Asp Thr Gly 
180 185 190 

Lys Lys Tyr Ser Trp Gln Ala Arg Gly Leu Val Asn Ala Thr Gly Pro 
195 200 205 

Trp Val Lys Gln Phe Phe Asp Asp Gly Met His Leu Pro Ser Pro Tyr 
210 215 220 

Gly Ile Arg Leu Ile Lys Gly Ser His Ile Val Val Pro Arg Val His 
225 230 235 240 

Thr Gln Lys Gln Ala Tyr Ile Leu Gln Asn Glu Asp Lys Arg Ile Val 
245 250 255 

Phe Val Ile Pro Trp Met Asp Glu Phe Ser Ile Ile Gly Thr Thr Asp 
260 265 270 

Val Glu Tyr Lys Gly Asp Pro Lys Ala Val Lys Ile Glu Glu Ser Glu 
275 280 285 

Ile Asn Tyr Leu Leu Asn Val Tyr Asn Thr His Phe Lys Lys Gln Leu 
290 295 300 

Ser Arg Asp Asp Ile Val Trp Thr Tyr Ser Gly Val Arg Pro Leu Cys 
305 310 315 320 

Asp Asp Glu Ser Asp Ser Pro Gln Ala Ile Thr Arg Asp Tyr Thr Leu 
325 330 335 

Asp Ile His Asp Glu Asn Gly Lys Ala Pro Leu Leu Ser Val Phe Gly 
340 345 350 

Gly Lys Leu Thr Thr Tyr Arg Lys Leu Ala Glu His Ala Leu Glu Lys 
355 360 365 

Leu Thr Pro Tyr Tyr Gln Gly Ile Gly Pro Ala Trp Thr Lys Glu Ser 
370 375 380 

Val Leu Pro Gly Gly Ala Ile Glu Gly Asp Arg Asp Asp Tyr Ala Ala 
385 390 395 400 

Arg Leu Arg Arg Arg Tyr Pro Phe Leu Thr Glu Ser Leu Ala Arg His 
405 410 415 

Tyr Ala Arg Thr Tyr Gly Ser Asn Ser Glu Leu Leu Leu Gly Asn Ala 
420 425 430 

Gly Thr Val Ser Asp Leu Gly Glu Asp Phe Gly His Glu Phe Tyr Glu 
435 440 445 

Ala Glu Leu Lys Tyr Leu Val Asp His Glu Trp Val Arg Arg Ala Asp 
450 455 460 

Asp Ala Leu Trp Arg Arg Thr Lys Gln Gly Met Trp Leu Asn Ala Asp 
465 470 475 480 

Gln Gln Ser Arg Val Ser Gln Trp Leu Val Glu Tyr Thr Gln Gln Arg 
485 490 495 

Leu Ser Leu Ala Ser 
500 



(2) INFORMATION FOR SEQ ID NO: 16: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GLPABC 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Lys Thr Arg Asp Ser Gln Ser Ser Asp Val Ile Ile Ile Gly Gly 
1 5 10 15 

Gly Ala Thr Gly Ala Gly Ile Ala Arg Asp Cys Ala Leu Arg Gly Leu 
20 25 30 

Arg Val Ile Leu Val Glu Arg His Asp Ile Ala Thr Gly Ala Thr Gly 
35 40 45 

Arg Asn His Gly Leu Leu His Ser Gly Ala Arg Tyr Ala Val Thr Asp 
50 55 60 

Ala Glu Ser Ala Arg Glu Cys Ile Ser Glu Asn Gln Ile Leu Lys Arg 
65 70 75 80 

Ile Ala Arg His Cys Val Glu Pro Thr Asn Gly Leu Phe Ile Thr Leu 
85 90 95 

Pro Glu Asp Asp Leu Ser Phe Gln Ala Thr Phe Ile Arg Ala Cys Glu 
100 105 110 

Glu Ala Gly Ile Ser Ala Glu Ala Ile Asp Pro Gln Gln Ala Arg Ile 
115 120 125 

Ile Glu Pro Ala Val Asn Pro Ala Leu Ile Gly Ala Val Lys Val Pro 
130 135 140 

Asp Gly Thr Val Asp Pro Phe Arg Leu Thr Ala Ala Asn Met Leu Asp 
145 150 155 160 

Ala Lys Glu His Gly Ala Val Ile Leu Thr Ala His Glu Val Thr Gly 
165 170 175 

Leu Ile Arg Glu Gly Ala Thr Val Cys Gly Val Arg Val Arg Asn His 
180 185 190 

Leu Thr Gly Glu Thr Gln Ala Leu His Ala Pro Val Val Val Asn Ala 
195 200 205 

Ala Gly Ile Trp Gly Gln His Ile Ala Glu Tyr Ala Asp Leu Arg Ile 
210 215 220 

Arg Met Phe Pro Ala Lys Gly Ser Leu Leu Ile Met Asp His Arg Ile 
225 230 235 240 

Asn Gln His Val Ile Asn Arg Cys Arg Lys Pro Ser Asp Ala Asp Ile 
245 250 255 

Leu Val Pro Gly Asp Thr Ile Ser Leu Ile Gly Thr Thr Ser Leu Arg 
260 265 270 

Ile Asp Tyr Asn Glu Ile Asp Asp Asn Arg Val Thr Ala Glu Glu Val 
275 280 285 

Asp Ile Leu Leu Arg Glu Gly Glu Lys Leu Ala Pro Val Met Ala Lys 
290 295 300 

Thr Arg Ile Leu Arg Ala Tyr Ser Gly Val Arg Pro Leu Val Ala Ser 
305 310 315 320 

Asp Asp Asp Pro Ser Gly Arg Asn Leu Ser Arg Gly Ile Val Leu Leu 
325 330 335 

Asp His Ala Glu Arg Asp Gly Leu Asp Gly Phe Ile Thr Ile Thr Gly 
340 345 350 

Gly Lys Leu Met Thr Tyr Arg Leu Met Ala Glu Trp Ala Thr Asp Ala 
355 360 365 

Val Cys Arg Lys Leu Gly Asn Thr Arg Pro Cys Thr Thr Ala Asp Leu 
370 375 380 

Ala Leu Pro Gly Ser Gln Glu Pro Ala Glu Val Thr Leu Arg Lys Val 
385 390 395 400 

Ile Ser Leu Pro Ala Pro Leu Arg Gly Ser Ala Val Tyr Arg His Gly 
405 410 415 

Asp Arg Thr Pro Ala Trp Leu Ser Glu Gly Arg Leu His Arg Ser Leu 
420 425 430 

Val Cys Glu Cys Glu Ala Val Thr Ala Gly Glu Val Gln Tyr Ala Val 
435 440 445 

Glu Asn Leu Asn Val Asn Ser Leu Leu Asp Leu Arg Arg Arg Thr Arg 
450 455 460 

Val Gly Met Gly Thr Cys Gln Gly Glu Leu Cys Ala Cys Arg Ala Ala 
465 470 475 480 

Gly Leu Leu Gln Arg Phe Asn Val Thr Thr Ser Ala Gln Ser Ile Glu 
485 490 495 

Gln Leu Ser Thr Phe Leu Asn Glu Arg Trp Lys Gly Val Gln Pro Ile 
500 505 510 

Ala Trp Gly Asp Ala Leu Arg Glu Ser Glu Phe Thr Arg Trp Val Tyr 
515 520 525 

Gln Gly Leu Cys Gly Leu Glu Lys Glu Gln Lys Asp Ala Leu 
530 535 540 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 250 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GPP2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Gly Leu Thr Thr Lys Pro Leu Ser Leu Lys Val Asn Ala Ala Leu 
1 5 10 15 

Phe Asp Val Asp Gly Thr Ile Ile Ile Ser Gln Pro Ala Ile Ala Ala 
20 25 30 

Phe Trp Arg Asp Phe Gly Lys Asp Lys Pro Tyr Phe Asp Ala Glu His 
35 40 45 

Val Ile Gln Val Ser His Gly Trp Arg Thr Phe Asp Ala Ile Ala Lys 
50 55 60 

Phe Ala Pro Asp Phe Ala Asn Glu Glu Tyr Val Asn Lys Leu Glu Ala 
65 70 75 80 

Glu Ile Pro Val Lys Tyr Gly Glu Lys Ser Ile Glu Val Pro Gly Ala 
85 90 95 

Val Lys Leu Cys Asn Ala Leu Asn Ala Leu Pro Lys Glu Lys Trp Ala 
100 105 110 

Val Ala Thr Ser Gly Thr Arg Asp Met Ala Gln Lys Trp Phe Glu His 
115 120 125 

Leu Gly Ile Arg Arg Pro Lys Tyr Phe Ile Thr Ala Asn Asp Val Lys 
130 135 140 

Gln Gly Lys Pro His Pro Glu Pro Tyr Leu Lys Gly Arg Asn Gly Leu 
145 150 155 160 

Gly Tyr Pro Ile Asn Glu Gln Asp Pro Ser Lys Ser Lys Val Val Val 
165 170 175 

Phe Glu Asp Ala Pro Ala Gly Ile Ala Ala Gly Lys Ala Ala Gly Cys 
180 185 190 

Lys Ile Ile Gly Ile Ala Thr Thr Phe Asp Leu Asp Phe Leu Lys Glu 
195 200 205 

Lys Gly Cys Asp Ile Ile Val Lys Asn His Glu Ser Ile Arg Val Gly 
210 215 220 

Gly Tyr Asn Ala Glu Thr Asp Glu Val Glu Phe Ile Phe Asp Asp Tyr 
225 230 235 240 

Leu Tyr Ala Lys Asp Asp Leu Leu Lys Trp 
245 250 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 709 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: GUT1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Phe Pro Ser Leu Phe Arg Leu Val Val Phe Ser Lys Arg Tyr Ile 
1 5 10 15 

Phe Arg Ser Ser Gln Arg Leu Tyr Thr Ser Leu Lys Gln Glu Gln Ser 
20 25 30 

Arg Met Ser Lys Ile Met Glu Asp Leu Arg Ser Asp Tyr Val Pro Leu 
35 40 45 

Ile Ala Ser Ile Asp Val Gly Thr Thr Ser Ser Arg Cys Ile Leu Phe 
50 55 60 

Asn Arg Trp Gly Gln Asp Val Ser Lys His Gln Ile Glu Tyr Ser Thr 
65 70 75 80 

Ser Ala Ser Lys Gly Lys Ile Gly Val Ser Gly Leu Arg Arg Pro Ser 
85 90 95 

Thr Ala Pro Ala Arg Glu Thr Pro Asn Ala Gly Asp Ile Lys Thr Ser 
100 105 110 

Gly Lys Pro Ile Phe Ser Ala Glu Gly Tyr Ala Ile Gln Glu Thr Lys 
115 120 125 

Phe Leu Lys Ile Glu Glu Leu Asp Leu Asp Phe His Asn Glu Pro Thr 
130 135 140 

Leu Lys Phe Pro Lys Pro Gly Trp Val Glu Cys His Pro Gln Lys Leu 
145 150 155 160 

Leu Val Asn Val Val Gln Cys Leu Ala Ser Ser Leu Leu Ser Leu Gln 
165 170 175 

Thr Ile Asn Ser Glu Arg Val Ala Asn Gly Leu Pro Pro Tyr Lys Val 
180 185 190 

Ile Cys Met Gly Ile Ala Asn Met Arg Glu Thr Thr Ile Leu Trp Ser 
195 200 205 

Arg Arg Thr Gly Lys Pro Ile Val Asn Tyr Gly Ile Val Trp Asn Asp 
210 215 220 

Thr Arg Thr Ile Lys Ile Val Arg Asp Lys Trp Gln Asn Thr Ser Val 
225 230 235 240 

Asp Arg Gln Leu Gln Leu Arg Gln Lys Thr Gly Leu Pro Leu Leu Ser 
245 250 255 

Thr Tyr Phe Ser Cys Ser Lys Leu Arg Trp Phe Leu Asp Asn Glu Pro 
260 265 270 

Leu Cys Thr Lys Ala Tyr Glu Glu Asn Asp Leu Met Phe Gly Thr Val 
275 280 285 

Asp Thr Trp Leu Ile Tyr Gln Leu Thr Lys Gln Lys Ala Phe Val Ser 
290 295 300 

Asp Val Thr Asn Ala Ser Arg Thr Gly Phe Met Asn Leu Ser Thr Leu 
305 310 315 320 

Lys Tyr Asp Asn Glu Leu Leu Glu Phe Trp Gly Ile Asp Lys Asn Leu 
325 330 335 

Ile His Met Pro Glu Ile Val Ser Ser Ser Gln Tyr Tyr Gly Asp Phe 
340 345 350 

Gly Ile Pro Asp Trp Ile Met Glu Lys Leu His Asp Ser Pro Lys Thr 
355 360 365 

Val Leu Arg Asp Leu Val Lys Arg Asn Leu Pro Ile Gln Gly Cys Leu 
370 375 380 

Gly Asp Gln Ser Ala Ser Met Val Gly Gln Leu Ala Tyr Lys Pro Gly 
385 390 395 400 

Ala Ala Lys Cys Thr Tyr Gly Thr Gly Cys Phe Leu Leu Tyr Asn Thr 
405 410 415 

Gly Thr Lys Lys Leu Ile Ser Gln His Gly Ala Leu Thr Thr Leu Ala 
420 425 430 

Phe Trp Phe Pro His Leu Gln Glu Tyr Gly Gly Gln Lys Pro Glu Leu 
435 440 445 

Ser Lys Pro His Phe Ala Leu Glu Gly Ser Val Ala Val Ala Gly Ala 
450 455 460 

Val Val Gln Trp Leu Arg Asp Asn Leu Arg Leu Ile Asp Lys Ser Glu 
465 470 475 480 

Asp Val Gly Pro Ile Ala Ser Thr Val Pro Asp Ser Gly Gly Val Val 
485 490 495 

Phe Val Pro Ala Phe Ser Gly Leu Phe Ala Pro Tyr Trp Asp Pro Asp 
500 505 510 

Ala Arg Ala Thr Ile Met Gly Met Ser Gln Phe Thr Thr Ala Ser His 
515 520 525 

Ile Ala Arg Ala Ala Val Glu Gly Val Cys Phe Gln Ala Arg Ala Ile 
530 535 540 

Leu Lys Ala Met Ser Ser Asp Ala Phe Gly Glu Gly Ser Lys Asp Arg 
545 550 555 560 

Asp Phe Leu Glu Glu Ile Ser Asp Val Thr Tyr Glu Lys Ser Pro Leu 
565 570 575 

Ser Val Leu Ala Val Asp Gly Gly Met Ser Arg Ser Asn Glu Val Met 
580 585 590 

Gln Ile Gln Ala Asp Ile Leu Gly Pro Cys Val Lys Val Arg Arg Ser 
595 600 605 

Pro Thr Ala Glu Cys Thr Ala Leu Gly Ala Ala Ile Ala Ala Asn Met 
610 615 620 

Ala Phe Lys Asp Val Asn Glu Arg Pro Leu Trp Lys Asp Leu His Asp 
625 630 635 640 

Val Lys Lys Trp Val Phe Tyr Asn Gly Met Glu Lys Asn Glu Gln Ile 
645 650 655 

Ser Pro Glu Ala His Pro Asn Leu Lys Ile Phe Arg Ser Glu Ser Asp 
660 665 670 

Asp Ala Glu Arg Arg Lys His Trp Lys Tyr Trp Glu Val Ala Val Glu 
675 680 685 

Arg Ser Lys Gly Trp Leu Lys Asp Ile Glu Gly Glu His Glu Gln Val 
690 695 700 

Leu Glu Asn Phe Gln 
705 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12145 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: PHK28-26 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GTCGACCACC ACGGTGGTGA CTTTAATGCC GCTCTCATGC AGCAGCTCGG TGGCGGTCTC 60 

AAAATTCAGG ATGTCGCCGG TATAGTTTTT GATAATCAGC AAGACGCCTT CGCCGCCGTC 120 

AATTTGCATC GCGCATTCAA ACATTTTGTC CGGCGTCGGC GAGGTGAATA TTTCCCCCGG 180 

ACAGGCGCCG GAGAGCATGC CCTGGCCGAT ATAGCCGCAG TGCATCGGTT CATGTCCGCT 240 

GCCGCCGCCG GAGAGCAGGG CCACCTTGCC AGCCACCGGC GCGTCGGTGC GGGTCACATA 300 

CAGCGGGTCC TGATGCAGGG TCAGCTGCGG ATGGGCTTTA GCCAGCCCCT GTAATTGTTC 360 

ATTCAGTACA TCTTCAACAC GGTTAATCAG CTTTTTCATT ATTCAGTGCT CCGTTGGAGA 420 

AGGTTCGATG CCGCCTCTCT GCTGGCGGAG GCGGTCATCG CGTAGGGGTA TCGTCTGACG 480 

GTGGAGCGTG CCTGGCGATA TGATGATTCT GGCTGAGCGG ACGAAAAAAA GAATGCCCCG 540 

ACGATCGGGT TTCATTACGA AACATTGCTT CCTGATTTTG TTTCTTTATG GAACGTTTTT 600 

GCTGAGGATA TGGTGAAAAT GCGAGCTGGC GCGCTTTTTT TCTTCTGCCA TAAGCGGCGG 660 

TCAGGATAGC CGGCGAAGCG GGTGGGAAAA AATTTTTTGC TGATTTTCTG CCGACTGCGG 720 

GAGAAAAGGC GGTCAAACAC GGAGGATTGT AAGGGCATTA TGCGGCAAAG GAGCGGATCG 780 

GGATCGCAAT CCTGACAGAG ACTAGGGTTT TTTGTTCCAA TATGGAACGT AAAAAATTAA 840 

CCTGTGTTTC ATATCAGAAC AAAAAGGCGA AAGATTTTTT TGTTCCCTGC CGGCCCTACA 900 

GTGATCGCAC TGCTCCGGTA CGCTCCGTTC AGGCCGCGCT TCACTGGCCG GCGCGGATAA 960 

CGCCAGGGCT CATCATGTCT ACATGCGCAC TTATTTGAGG GTGAAAGGAA TGCTAAAAGT 1020 

TATTCAATCT CCAGCCAAAT ATCTTCAGGG TCCTGATGCT GCTGTTCTGT TCGGTCAATA 1080 

TGCCAAAAAC CTGGCGGAGA GCTTCTTCGT CATCGCTGAC GATTTCGTAA TGAAGCTGGC 1140 

GGGAGAGAAA GTGGTGAATG GCCTGCAGAG CCACGATATT CGCTGCCATG CGGAACGGTT 1200 

TAACGGCGAA TGCAGCCATG CGGAAATCAA CCGTCTGATG GCGATTTTGC AAAAACAGGG 1260 

CTGCCGCGGC GTGGTCGGGA TCGGCGGTGG TAAAACCCTC GATACCGCGA AGGCGATCGG 1320 

TTACTACCAG AAGCTGCCGG TGGTGGTGAT CCCGACCATC GCCTCGACCG ATGCGCCAAC 1380 

CAGCGCGCTG TCGGTGATCT ACACCGAAGC GGGCGAGTTT GAAGAGTATC TGATCTATCC 1440 

GAAAAACCCG GATATGGTGG TGATGGACAC GGCGATTATC GCCAAAGCGC CGGTACGCCT 1500 

GCTGGTCTCC GGCATGGGCG ATGCGCTCTC CACCTGGTTC GAGGCCAAAG CTTGCTACGA 1560 

TGCGCGCGCC ACCAGCATGG CCGGAGGACA GTCCACCGAG GCGGCGCTGA GCCTCGCCCG 1620 

CCTGTGCTAT GATACGCTGC TGGCGGAGGG CGAAAAGGCC CGTCTGGCGG CGCAGGCCGG 1680 

GGTAGTGACC GAAGCGCTGG AGCGCATCAT CGAGGCGAAC ACTTACCTCA GCGGCATTGG 1740 

CTTTGAAAGC AGTGGCCTGG CCGCTGCCCA TGCAATCCAC AACGGTTTCA CCATTCTTGA 1800 

AGAGTGCCAT CACCTGTATC ACGGTGAGAA AGTGGCCTTC GGTACCCTGG CGCAGCTGGT 1860 

GCTGCAGAAC AGCCCGATGG ACGAGATTGA AACGGTGCAG GGCTTCTGCC AGCGCGTCGG 1920 

CCTGCCGGTG ACGCTCGCGC AGATGGGCGT CAAAGAGGGG ATCGACGAGA AAATCGCCGC 1980 

GGTGGCGAAA GCTACCTGCG CGGAAGGGGA AACCATCCAT AATATGCCGT TTGCGGTGAC 2040 

CCCGGAGAGC GTCCATGCCG CTATCCTCAC CGCCGATCTG TTAGGCCAGC AGTGGCTGGC 2100 

GCGTTAATTC GCGGTGGCTA AACCGCTGGC CCAGGTCAGC GGTTTTTCTT TCTCCCCTCC 2160 

GGCAGTCGCT GCCGGAGGGG TTCTCTATGG TACAACGCGG AAAAGGATAT GACTGTTCAG 2220 

ACTCAGGATA CCGGGAAGGC GGTCTCTTCC GTCATTGCCC AGTCATGGCA CCGCTGCAGC 2280 

AAGTTTATGC AGCGCGAAAC CTGGCAAACG CCGCACCAGG CCCAGGGCCT GACCTTCGAC 2340 

TCCATCTGTC GGCGTAAAAC CGCGCTGCTC ACCATCGGCC AGGCGGCGCT GGAAGACGCC 2400 

TGGGAGTTTA TGGACGGCCG CCCCTGCGCG CTGTTTATTC TTGATGAGTC CGCCTGCATC 2460 

CTGAGCCGTT GCGGCGAGCC GCAAACCCTG GCCCAGCTGG CTGCCCTGGG ATTTCGCGAC 2520 

GGCAGCTATT GTGCGGAGAG CATTATCGGC ACCTGCGCGC TGTCGCTGGC CGCGATGCAG 2580 

GGCCAGCCGA TCAACACCGC CGGCGATCGG CATTTTAAGC AGGCGCTACA GCCATGGAGT 2640 

TTTTGCTCGA CGCCGGTGTT TGATAACCAC GGGCGGCTGT TCGGCTCTAT CTCGCTTTGC 2700 

TGTCTGGTCG AGCACCAGTC CAGCGCCGAC CTCTCCCTGA CGCTGGCCAT CGCCCGCGAG 2760 

GTGGGTAACT CCCTGCTTAC CGACAGCCTG CTGGCGGAAT CCAACCGTCA CCTCAATCAG 2820 

ATGTACGGCC TGCTGGAGAG CATGGACGAT GGGGTGATGG CGTGGAACGA ACAGGGCGTG 2880 

CTGCAGTTTC TCAATGTTCA GGCGGCGAGA CTGCTGCATC TTGATGCTCA GGCCAGCCAG 2940 

GGGAAAAATA TCGCCGATCT GGTGACCCTC CCGGCGCTGC TGCGCCGCGC CATCAAACAC 3000 

GCCCGCGGCC TGAATCACGT CGAAGTCACC TTTGAAAGTC AGCATCAGTT TGTCGATGCG 3060 

GTGATCACCT TAAAACCGAT TGTCGAGGCG CAAGGCAACA GTTTTATTCT GCTGCTGCAT 3120 

CCGGTGGAGC AGATGCGGCA GCTGATGACC AGCCAGCTCG GTAAAGTCAG CCACACCTTT 3180 

GAGCAGATGT CTGCCGACGA TCCGGAAACC CGACGCCTGA TCCACTTTGG CCGCCAGGCG 3240 

GCGCGCGGCG GCTTCCCGGT GCTACTGTGC GGCGAAGAGG GGGTCGGGAA AGAGCTGCTG 3300 

AGCCAGGCTA TTCACAATGA AAGCGAACGG GCGGGCGGCC CCTACATCTC CGTCAACTGC 3360 

CAGCTATATG CCGACAGCGT GCTGGGCCAG GACTTTATGG GCAGCGCCCC TACCGACGAT 3420 

GAAAATGGTC GCCTGAGCCG CCTTGAGCTG GCCAACGGCG GCACCCTGTT TCTGGAAAAG 3480 

ATCGAGTATC TGGCGCCGGA GCTGCAGTCG GCTCTGCTGC AGGTGATTAA GCAGGGCGTG 3540 

CTCACCCGCC TCGACGCCCG GCGCCTGATC CCGGTGGATG TGAAGGTGAT TGCCACCACC 3600 

ACCGTCGATC TGGCCAATCT GGTGGAACAG AACCGCTTTA GCCGCCAGCT GTACTATGCG 3660 

CTGCACTCCT TTGAGATCGT CATCCCGCCG CTGCGCGCCC GACGCAACAG TATTCCGTCG 3720 

CTGGTGCATA ACCGGTTGAA GAGCCTGGAG AAGCGTTTCT CTTCGCGACT GAAAGTGGAC 3780 

GATGACGCGC TGGCACAGCT GGTGGCCTAC TCGTGGCCGG GGAATGATTT TGAGCTCAAC 3840 

AGCGTCATTG AGAATATCGC CATCAGCAGC GACAACGGCC ACATTCGCCT GAGTAATCTG 3900 

CCGGAATATC TCTTTTCCGA GCGGCCGGGC GGGGATAGCG CGTCATCGCT GCTGCCGGCC 3960 

AGCCTGACTT TTAGCGCCAT CGAAAAGGAA GCTATTATTC ACGCCGCCCG GGTGACCAGC 4020 

GGGCGGGTGC AGGAGATGTC GCAGCTGCTC AATATCGGCC GCACCACCCT GTGGCGCAAA 4080 

ATGAAGCAGT ACGATATTGA CGCCAGCCAG TTCAAGCGCA AGCATCAGGC CTAGTCTCTT 4140 

CGATTCGCGC CATGGAGAAC AGGGCATCCG ACAGGCGATT GCTGTAGCGT TTGAGCGCGT 4200 

CGCGCAGCGG ATGCGCGCGG TCCATGGCCG TCAGCAGGCG TTCGAGCCGA CGGGACTGGG 4260 

TGCGCGCCAC GTGCAGCTGG GCAGAGGCGA GATTCCTCCC CGGGATCACG AACTGTTTTA 4320 

ACGGGCCGCT CTCGGCCATA TTGCGGTCGA TAAGCCGCTC CAGGGCGGTG ATCTCCTCTT 4380 

CGCCGATCGT CTGGCTCAGG CGGGTCAGGC CCCGCGCATC GCTGGCCAGT TCAGCCCCCA 4440 

GCACGAACAG CGTCTGCTGA ATATGGTGCA GGCTTTCCCG CAGCCCGGCG TCGCGGGTCG 4500 

TGGCGTAGCA GACGCCCAGC TGGGATATCA GTTCATCGAC GGTGCCGTAG GCCTCGACGC 4560 

GAATATGGTC TTTCTCGATG CGGCTGCCGC CGTACAGGGC GGTGGTGCCT TTATCCCCGG 4620 

TGCGGGTATA GATACGATAC ATTCAGTTTC TCTCACTTAA CGGCAGGACT TTAACCAGCT 4680 

GCCCGGCGTT GGCGCCGAGC GTACGCAGTT GATCGTCGCT ATCGGTGACG TGTCCGGTAG 4740 

CCAGCGGCGC GTCCGCCGGC AGCTGGGCAT GAGTGAGGGC TATCTCGCCG GACGCGCTGA 4800 

GCCCGATACC CACCCGCAGG GGCGAGCTTC TGGCCGCCAG GGCGCCCAGC GCAGCGGCGT 4860 

CACCGCCTCC GTCATAGGTT ATGGTCTGGC AGGGGACCCC CTGCTCCTCC AGCCCCCAGC 4920 

ACAGCTCATT GATGGCGCCG GCATGGTGCC CGCGCGGATC GTAAAACAGG CGTACGCCTG 4980 

GCGGTGAAAG CGACATGACG GTCCCCTCGT TAACACTCAG AATGCCTGGC GGAAAATCGC 5040 

GGCAATCTCC TGCTCGTTGC CTTTACGCGG GTTCGAGAAC GCATTGCCGT CTTTTAGAGC 5100 

CATCTCCGCC ATGTAGGGGA AGTCGGCCTC TTTTACCCCC AGATCGCGCA GATGCTGCGG 5160 

AATACCGATA TCCATCGACA GACGCGTGAT AGCGGCGATG GCTTTTTCCG CCGCGTCGAG 5220 

AGTGGACAGT CCGGTGATAT TTTCGCCCAT CAGTTCAGCG ATATCGGCGA ATTTCTCCGG 5280 

GTTGGCGATC AGGTTGTAGC GCGCCACATG CGGCAGCAGG ACAGCGTTGG CCACGCCGTG 5340 

CGGCATGTCG TACAGGCCGC CCAGCTGGTG CGCCATGGCG TGCACGTAGC CGAGGTTGGC 5400 

GTTATTGAAA GCCATCCCGG CCAGCAGAGA AGCATAGGCC ATGTTTTCCC GCGCCTGCAG 5460 

ATTGCTGCCG AGGGCCACGG CCTGGCGCAG GTTGCGGGCG ATGAGGCGGA TCGCCTGCAT 5520 

GGCGGCGGCG TCCGTCACCG GGTTAGCGTC TTTGGAGATA TAGGCCTCTA CGGCGTGGGT 5580 

CAGGGCATCC ATCCCGGTCG CCGCGGTCAG GGCGGCCGGT TTACCGATCA TCAGCAGTGG 5640 

ATCGTTGATA GAGACCGACG GCAGTTTGCG CCAGCTGACG ATCACAAACT TCACTTTGGT 5700 

TTCGGTGTTG GTCAGGACGC AGTGGCGGGT GACCTCGCTG GCGGTGCCGG CGGTGGTATT 5760 

GACCGCGACG ATAGGCGGCA GCGGGTTGGT CAGGGTCTCG ATTCCGGCAT ACTGGTACAG 5820 

ATCGCCCTCA TGGGTGGCGG CGATGCCGAT GCCTTTGCCG CAATCGTGCG GGCTGCCGCC 5880 

GCCCACGGTG ACGATGATGT CGCACTGTTC GCGGCGAAAC ACGGCGAGGC CGTCGCGCAC 5940 

GTTGGTGTCT TTCGGGTTCG GCTCGACGCC GTCAAAGATC GCCACCTCGA TCCCGGCCTC 6000 

CCGCAGATAA TGCAGGGTTT TGTCCACCGC GCCATCTTTA ATTGCCCGCA GGCCTTTGTC 6060 

GGTGACCAGC AGGGCTTTTT TCCCCCCCAG CAGCTGGCAG CGTTCGCCGA CTACGGAAAT 6120 

GGCGTTGGGG CCAAAAAAGT TAACGTTTGG CACCAGATAA TCAAACATAC GATAGCTCAT 6180 

AATATACCTT CTCGCTTCAG GTTATAATGC GGAAAAACAA TCCAGGGCGC ACTGGGCTAA 6240 

TAATTGATCC TGCTCGACCG TACCGCCGCT AACGCCGACG GCGCCAATTA CCTGCTCATT 6300 

AAAAATAACT GGCAGGCCGC CGCCAAAAAT AATAATTCGC TGTTGGTTGG TTAGCTGCAG 6360 

ACCGTACAGA GATTGTCCTG GCTGGACCGC TGACGTAATT TCATGGGTAC CTTGCTTCAG 6420 

GCTGCAGGCG CTCCAGGCTT TATTCAGGGA AATATCGCAG CTGGAGACGA AGGCCTCGTC 6480 

CATCCGCTGG ATAAGCAGCG TGTTGCCTCC GCGGTCAACT ACGGAAAACA CCACCGCCAC 6540 

GTTGATCTCA GTGGCTTTTT TTTCCACCGC CGCCGCCATT TGCTGGGCGG CGGCCAGGGT 6600 

GATTGTCTGA ACTTGTTGGC TCTTGTTCAT CATTCTCTCC CGCACCAGGA TAACGCTGGC 6660 

GCGAATAGTC AGTAGGGGGC GATAGTAAAA AACTATTACC ATTCGGTTGG CTTGCTTTAT 6720 

TTTTGTCAGC GTTATTTTGT CGCCCGCCAT GATTTAGTCA ATAGGGTTAA AATAGCGTCG 6780 

GAAAAACGTA ATTAAGGGCG TTTTTTATTA ATTGATTTAT ATCATTGCGG GCGATCACAT 6840 

TTTTTATTTT TGCCGCCGGA GTAAAGTTTC ATAGTGAAAC TGTCGGTAGA TTTCGTGTGC 6900 

CAAATTGAAA CGAAATTAAA TTTATTTTTT TCACCACTGG CTCATTTAAA GTTCCGCTAT 6960 

TGCCGGTAAT GGCCGGGCGG CAACGACGCT GGCCCGGCGT ATTCGCTACC GTCTGCGGAT 7020 

TTCACCTTTT GAGCCGATGA ACAATGAAAA GATCAAAACG ATTTGCAGTA CTGGCCCAGC 7080 

GCCCCGTCAA TCAGGACGGG CTGATTGGCG AGTGGCCTGA AGAGGGGCTG ATCGCCATGG 7140 

ACAGCCCCTT TGACCCGGTC TCTTCAGTAA AAGTGGACAA CGGTCTGATC GTCGAACTGG 7200 

ACGGCAAACG CCGGGACCAG TTTGACATGA TCGACCGATT TATCGCCGAT TACGCGATCA 7260 

ACGTTGAGCG CACAGAGCAG GCAATGCGCC TGGAGGCGGT GGAAATAGCC CGTATGCTGG 7320 

TGGATATTCA CGTCAGCCGG GAGGAGATCA TTGCCATCAC TACCGCCATC ACGCCGGCCA 7380 

AAGCGGTCGA GGTGATGGCG CAGATGAACG TGGTGGAGAT GATGATGGCG CTGCAGAAGA 7440 

TGCGTGCCCG CCGGACCCCC TCCAACCAGT GCCACGTCAC CAATCTCAAA GATAATCCGG 7500 

TGCAGATTGC CGCTGACGCC GCCGAGGCCG GGATCCGCGG CTTCTCAGAA CAGGAGACCA 7560 

CGGTCGGTAT CGCGCGCTAC GCGCCGTTTA ACGCCCTGGC GCTGTTGGTC GGTTCGCAGT 7620 

GCGGCCGCCC CGGCGTGTTG ACGCAGTGCT CGGTGGAAGA GGCCACCGAG CTGGAGCTGG 7680 

GCATGCGTGG CTTAACCAGC TACGCCGAGA CGGTGTCGGT CTACGGCACC GAAGCGGTAT 7740 

TTACCGACGG CGATGATACG CCGTGGTCAA AGGCGTTCCT CGCCTCGGCC TACGCCTCCC 7800 

GCGGGTTGAA AATGCGCTAC ACCTCCGGCA CCGGATCCGA AGCGCTGATG GGCTATTCGG 7860 

AGAGCAAGTC GATGCTCTAC CTCGAATCGC GCTGCATCTT CATTACTAAA GGCGCCGGGG 7920 

TTCAGGGACT GCAAAACGGC GCGGTGAGCT GTATCGGCAT GACCGGCGCT GTGCCGTCGG 7980 

GCATTCGGGC GGTGCTGGCG GAAAACCTGA TCGCCTCTAT GCTCGACCTC GAAGTGGCGT 8040 

CCGCCAACGA CCAGACTTTC TCCCACTCGG ATATTCGCCG CACCGCGCGC ACCCTGATGC 8100 

AGATGCTGCC GGGCACCGAC TTTATTTTCT CCGGCTACAG CGCGGTGCCG AACTACGACA 8160 

ACATGTTCGC CGGCTCGAAC TTCGATGCGG AAGATTTTGA TGATTACAAC ATCCTGCAGC 8220 

GTGACCTGAT GGTTGACGGC GGCCTGCGTC CGGTGACCGA GGCGGAAACC ATTGCCATTC 8280 

GCCAGAAAGC GGCGCGGGCG ATCCAGGCGG TTTTCCGCGA GCTGGGGCTG CCGCCAATCG 8340 

CCGACGAGGA GGTGGAGGCC GCCACCTACG CGCACGGCAG CAACGAGATG CCGCCGCGTA 8400 

ACGTGGTGGA GGATCTGAGT GCGGTGGAAG AGATGATGAA GCGCAACATC ACCGGCCTCG 8460 

ATATTGTCGG CGCGCTGAGC CGCAGCGGCT TTGAGGATAT CGCCAGCAAT ATTCTCAATA 8520 

TGCTGCGCCA GCGGGTCACC GGCGATTACC TGCAGACCTC GGCCATTCTC GATCGGCAGT 8580 

TCGAGGTGGT GAGTGCGGTC AACGACATCA ATGACTATCA GGGGCCGGGC ACCGGCTATC 8640 

GCATCTCTGC CGAACGCTGG GCGGAGATCA AAAATATTCC GGGCGTGGTT CAGCCCGACA 8700 

CCATTGAATA AGGCGGTATT CCTGTGCAAC AGACAACCCA AATTCAGCCC TCTTTTACCC 8760 

TGAAAACCCG CGAGGGCGGG GTAGCTTCTG CCGATGAACG CGCCGATGAA GTGGTGATCG 8820 

GCGTCGGCCC TGCCTTCGAT AAACACCAGC ATCACACTCT GATCGATATG CCCCATGGCG 8880 

CGATCCTCAA AGAGCTGATT GCCGGGGTGG AAGAAGAGGG GCTTCACGCC CGGGTGGTGC 8940 

GCATTCTGCG CACGTCCGAC GTCTCCTTTA TGGCCTGGGA TGCGGCCAAC CTGAGCGGCT 9000 

CGGGGATCGG CATCGGTATC CAGTCGAAGG GGACCACGGT CATCCATCAG CGCGATCTGC 9060 

TGCCGCTCAG CAACCTGGAG CTGTTCTCCC AGGCGCCGCT GCTGACGCTG GAGACCTACC 9120 

GGCAGATTGG CAAAAACGCT GCGCGCTATG CGCGCAAAGA GTCACCTTCG CCGGTGCCGG 9180 

TGGTGAACGA TCAGATGGTG CGGCCGAAAT TTATGGCCAA AGCCGCGCTA TTTCATATCA 9240 

AAGAGACCAA ACATGTGGTG CAGGACGCCG AGCCCGTCAC CCTGCACATC GACTTAGTAA 9300 

GGGAGTGACC ATGAGCGAGA AAACCATGCG CGTGCAGGAT TATCCGTTAG CCACCCGCTG 9360 

CCCGGAGCAT ATCCTGACGC CTACCGGCAA ACCATTGACC GATATTACCC TCGAGAAGGT 9420 

GCTCTCTGGC GAGGTGGGCC CGCAGGATGT GCGGATCTCC CGCCAGACCC TTGAGTACCA 9480 

GGCGCAGATT GCCGAGCAGA TGCAGCGCCA TGCGGTGGCG CGCAATTTCC GCCGCGCGGC 9540 

GGAGCTTATC GCCATTCCTG ACGAGCGCAT TCTGGCTATC TATAACGCGC TGCGCCCGTT 9600 

CCGCTCCTCG CAGGCGGAGC TGCTGGCGAT CGCCGACGAG CTGGAGCACA CCTGGCATGC 9660 

GACAGTGAAT GCCGCCTTTG TCCGGGAGTC GGCGGAAGTG TATCAGGAGC GGCATAAGCT 9720 

GCGTAAAGGA AGCTAAGCGG AGGTCAGCAT GCCGTTAATA GCCGGGATTG ATATCGGCAA 9780 

CGCCACCACC GAGGTGGCGC TGGCGTCCGA CTACCCGCAG GCGAGGGCGT TTGTTGCCAG 9840 

CGGGATCGTC GCGACGACGG GCATGAAAGG GACGCGGGAC AATATCGCCG GGACCCTCGC 9900 

WO 98/21341 PCT/US97/20873 



CGCGCTGGAG CAGGCCCTGG CGAAAACACC GTGGTCGATG AGCGATGTCT CTCGCATCTA 9960 

TCTTAACGAA GCCGCGCCGG TGATTGGCGA TGTGGCGATG GAGACCATCA CCGAGACCAT 10020 

TATCACCGAA TCGACCATGA TCGGTCATAA CCCGCAGACG CCGGGCGGGG TGGGCGTTGG 10080 

CGTGGGGACG ACTATCGCCC TCGGGCGGCT GGCGACGCTG CCGGCGGCGC AGTATGCCGA 10140 

GGGGTGGATC GTACTGATTG ACGACGCCGT CGATTTCCTT GACGCCGTGT GGTGGCTCAA 10200 

TGAGGCGCTC GACCGGGGGA TCAACGTGGT GGCGGCGATC CTCAAAAAGG ACGACGGCGT 10260 

GCTGGTGAAC AACCGCCTGC GTAAAACCCT GCCGGTGGTG GATGAAGTGA CGCTGCTGGA 10320 

GCAGGTCCCC GAGGGGGTAA TGGCGGCGGT GGAAGTGGCC GCGCCGGGCC AGGTGGTGCG 10380 

GATCCTGTCG AATCCCTACG GGATCGCCAC CTTCTTCGGG CTAAGCCCGG AAGAGACCCA 10440 

GGCCATCGTC CCCATCGCCC GCGCCCTGAT TGGCAACCGT TCCGCGGTGG TGCTCAAGAC 10500 

CCCGCAGGGG GATGTGCAGT CGCGGGTGAT CCCGGCGGGC AACCTCTACA TTAGCGGCGA 10560 

AAAGCGCCGC GGAGAGGCCG ATGTCGCCGA GGGCGCGGAA GCCATCATGC AGGCGATGAG 10620 

CGCCTGCGCT CCGGTACGCG ACATCCGCGG CGAACCGGGC ACCCACGCCG GCGGCATGCT 10680 

TGAGCGGGTG CGCAAGGTAA TGGCGTCCCT GACCGGCCAT GAGATGAGCG CGATATACAT 10740 

CCAGGATCTG CTGGCGGTGG ATACGTTTAT TCCGCGCAAG GTGCAGGGCG GGATGGCCGG 10800 

CGAGTGCGCC ATGGAGAATG CCGTCGGGAT GGCGGCGATG GTGAAAGCGG ATCGTCTGCA 10860 

AATGCAGGTT ATCGCCCGCG AACTGAGCGC CCGACTGCAG ACCGAGGTGG TGGTGGGCGG 10920 

CGTGGAGGCC AACATGGCCA TCGCCGGGGC GTTAACCACT CCCGGCTGTG CGGCGCCGCT 10980 

GGCGATCCTC GACCTCGGCG CCGGCTCGAC GGATGCGGCG ATCGTCAACG CGGAGGGGCA 11040 

GATAACGGCG GTCCATCTCG CCGGGGCGGG GAATATGGTC AGCCTGTTGA TTAAAACCGA 11100 

GCTGGGCCTC GAGGATCTTT CGCTGGCGGA AGCGATAAAA AAATACCCGC TGGCCAAAGT 11160 

GGAAAGCCTG TTCAGTATTC GTCACGAGAA TGGCGCGGTG GAGTTCTTTC GGGAAGCCCT 11220 

CAGCCCGGCG GTGTTCGCCA AAGTGGTGTA CATCAAGGAG GGCGAACTGG TGCCGATCGA 11280 

TAACGCCAGC CCGCTGGAAA AAATTCGTCT CGTGCGCCGG CAGGCGAAAG AGAAAGTGTT 11340 

TGTCACCAAC TGCCTGCGCG CGCTGCGCCA GGTCTCACCC GGCGGTTCCA TTCGCGATAT 11400 

CGCCTTTGTG GTGCTGGTGG GCGGCTCATC GCTGGACTTT GAGATCCCGC AGCTTATCAC 11460 

GGAAGCCTTG TCGCACTATG GCGTGGTCGC CGGGCAGGGC AATATTCGGG GAACAGAAGG 11520 

GCCGCGCAAT GCGGTCGCCA CCGGGCTGCT ACTGGCCGGT CAGGCGAATT AAACGGGCGC 11580 

TCGCGCCAGC CTCTCTCTTT AACGTGCTAT TTCAGGATGC CGATAATGAA CCAGACTTCT 11640 
ACCTTAACCG GGCAGTGCGT GGCCGAGTTT CT.TSGCACCG GATTGCTCAT JTT.CXTCGGC. 11700 
:^CGT3GCTGCG TCGCTGCGCT GCGGGTCGCC GGGGCCACTTT TTCGTCAGTG: :GGAGATCAGT " 11760 
rATTATCTGGG'.-GCCTTGGCGT- CGCCATGGCC r^TCTACCTGA^ CGGCCGGTGT CTCCGGCGCG ...11820 
CACCTAAATC CGGCGGTGAC CATTGCCCTG TGGCTGTTCG CCTGTTTTGA ACGCCGCAAG 11880 
GTGCTGCCGT TTATTGTTGC CCAGACGGCC GGGGCCTTCT GCGCCGCCGC GCTGGTGTAT 11940 
GGGCTCTATC GCCAGCTGTT TCTCGATCTT GAACAGAGTC AGCATATCGT GCGCGGCACT 12000 
GCCGCCAGTC TTAACCTGGC CGGGGTCTTT TCCACGTACC CGCATCCACA TATCACTTTT 12060 
; ATACAAGCGT TTGCCGTGGA GACCACCOTC"ACGGCAATCC "T GAT GGCGAT T5ATCATGGCC : -12120 
CTGACCGACG ACGGCAACGG AATTC 12145 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 94 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

AGCTTAGGAG TCTAGAATAT TGAGCTCGAA TTCCCGGGCA TGCGGTACCG GATCCAGAAA 60 

AAAGCCCGCA CCTGACAGTG CGGGCTTTTT TTTT 94 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

GGAATTCAGA TCTCAGCAAT GAGCGAGAAA ACCATGC 37 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

GCTCTAGATT AGCTTCCTTT ACGCAGC 
(2) . . INFORMATION FOR SEQ ID JJO:23: 

■ (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
GGCCAAGCTT AAGGAGGTTA.TATTAAATGAA AAG 



(2) INFORMATION FOR SEQ ID NO:24:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

GCTCTAGATT ATTCAATGGT GTCGGG



(2) INFORMATION FOR SEQ ID NO: 25:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

GCGCCGTCTA GAATTATGAG CTATCGTATG TTTGATTATC TG



(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

TCTGATACGG GATCCTCAGA ATGCCTGGCG GAAAAT

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GCGCGGATCC AGGAGTCTAG AATTATGGGA TTGACTACTA AACCTCTATC T

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

GATACGCCCG GGTTACCATT TCAACAGATC GTCCTT

(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

TCGACGAATT CAGGAGGA

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:



TCTAGT CCTCC TGAATTCG 



(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

CTAGTAAGGA GGACAATTC 19

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

CATGGAATTG TCCTCCTTA 19

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 271 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: GPP1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Met Lys Arg Phe Asn Val Leu Lys Tyr Ile Arg Thr Thr Lys Ala Asn
1 5 10 15

Ile Gln Thr Ile Ala Met Pro Leu Thr Thr Lys Pro Leu Ser Leu Lys
20 25 30

Ile Asn Ala Ala Leu Phe Asp Val Asp Gly Thr Ile Ile Ile Ser Gln
35 40 45

Pro Ala Ile Ala Ala Phe Trp Arg Asp Phe Gly Lys Asp Lys Pro Tyr
50 55 60

Phe Asp Ala Glu His Val Ile His Ile Ser His Gly Trp Arg Thr Tyr
65 70 75 80

Asp Ala Ile Ala Lys Phe Ala Pro Asp Phe Ala Asp Glu Glu Tyr Val
85 90 95

Asn Lys Leu Glu Gly Glu Ile Pro Glu Lys Tyr Gly Glu His Ser Ile
100 105 110

Glu Val Pro Gly Ala Val Lys Leu Cys Asn Ala Leu Asn Ala Leu Pro
115 120 125

Lys Glu Lys Trp Ala Val Ala Thr Ser Gly Thr Arg Asp Met Ala Lys
130 135 140

Lys Trp Phe Asp Ile Leu Lys Ile Lys Arg Pro Glu Tyr Phe Ile Thr
145 150 155 160

Ala Asn Asp Val Lys Gln Gly Lys Pro His Pro Glu Pro Tyr Leu Lys
165 170 175

Gly Arg Asn Gly Leu Gly Phe Pro Ile Asn Glu Gln Asp Pro Ser Lys
180 185 190

Ser Lys Val Val Val Phe Glu Asp Ala Pro Ala Gly Ile Ala Ala Gly
195 200 205

Lys Ala Ala Gly Cys Lys Ile Val Gly Ile Ala Thr Thr Phe Asp Leu
210 215 220

Asp Phe Leu Lys Glu Lys Gly Cys Asp Ile Ile Val Lys Asn His Glu
225 230 235 240

Ser Ile Arg Val Gly Glu Tyr Asn Ala Glu Thr Asp Glu Val Glu Leu
245 250 255

Ile Phe Asp Asp Tyr Leu Tyr Ala Lys Asp Asp Leu Leu Lys Trp
260 265 270

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 555 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: DHAB1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Met Lys Arg Ser Lys Arg Phe Ala Val Leu Ala Gln Arg Pro Val Asn
1 5 10 15

Gln Asp Gly Leu Ile Gly Glu Trp Pro Glu Glu Gly Leu Ile Ala Met
20 25 30

Asp Ser Pro Phe Asp Pro Val Ser Ser Val Lys Val Asp Asn Gly Leu
35 40 45

Ile Val Glu Leu Asp Gly Lys Arg Arg Asp Gln Phe Asp Met Ile Asp
50 55 60

Arg Phe Ile Ala Asp Tyr Ala Ile Asn Val Glu Arg Thr Glu Gln Ala
65 70 75 80

Met Arg Leu Glu Ala Val Glu Ile Ala Arg Met Leu Val Asp Ile His
85 90 95

Val Ser Arg Glu Glu Ile Ile Ala Ile Thr Thr Ala Ile Thr Pro Ala
100 105 110

Lys Ala Val Glu Val Met Ala Gln Met Asn Val Val Glu Met Met Met
115 120 125

Ala Leu Gln Lys Met Arg Ala Arg Arg Thr Pro Ser Asn Gln Cys His
130 135 140

Val Thr Asn Leu Lys Asp Asn Pro Val Gln Ile Ala Ala Asp Ala Ala
145 150 155 160

Glu Ala Gly Ile Arg Gly Phe Ser Glu Gln Glu Thr Thr Val Gly Ile
165 170 175

Ala Arg Tyr Ala Pro Phe Asn Ala Leu Ala Leu Leu Val Gly Ser Gln
180 185 190

Cys Gly Arg Pro Gly Val Leu Thr Gln Cys Ser Val Glu Glu Ala Thr
195 200 205

Glu Leu Glu Leu Gly Met Arg Gly Leu Thr Ser Tyr Ala Glu Thr Val
210 215 220

Ser Val Tyr Gly Thr Glu Ala Val Phe Thr Asp Gly Asp Asp Thr Pro
225 230 235 240

Trp Ser Lys Ala Phe Leu Ala Ser Ala Tyr Ala Ser Arg Gly Leu Lys
245 250 255



Met Arg Tyr Thr Ser Gly Thr Gly Ser Glu Ala Leu Met Gly Tyr Ser
260 265 270

Glu Ser Lys Ser Met Leu Tyr Leu Glu Ser Arg Cys Ile Phe Ile Thr
275 280 285

Lys Gly Ala Gly Val Gln Gly Leu Gln Asn Gly Ala Val Ser Cys Ile
290 295 300

Gly Met Thr Gly Ala Val Pro Ser Gly Ile Arg Ala Val Leu Ala Glu
305 310 315 320

Asn Leu Ile Ala Ser Met Leu Asp Leu Glu Val Ala Ser Ala Asn Asp
325 330 335

Gln Thr Phe Ser His Ser Asp Ile Arg Arg Thr Ala Arg Thr Leu Met
340 345 350

Gln Met Leu Pro Gly Thr Asp Phe Ile Phe Ser Gly Tyr Ser Ala Val
355 360 365

Pro Asn Tyr Asp Asn Met Phe Ala Gly Ser Asn Phe Asp Ala Glu Asp
370 375 380

Phe Asp Asp Tyr Asn Ile Leu Gln Arg Asp Leu Met Val Asp Gly Gly
385 390 395 400

Leu Arg Pro Val Thr Glu Ala Glu Thr Ile Ala Ile Arg Gln Lys Ala
405 410 415

Ala Arg Ala Ile Gln Ala Val Phe Arg Glu Leu Gly Leu Pro Pro Ile
420 425 430

Ala Asp Glu Glu Val Glu Ala Ala Thr Tyr Ala His Gly Ser Asn Glu
435 440 445

Met Pro Pro Arg Asn Val Val Glu Asp Leu Ser Ala Val Glu Glu Met
450 455 460

Met Lys Arg Asn Ile Thr Gly Leu Asp Ile Val Gly Ala Leu Ser Arg
465 470 475 480

Ser Gly Phe Glu Asp Ile Ala Ser Asn Ile Leu Asn Met Leu Arg Gln
485 490 495

Arg Val Thr Gly Asp Tyr Leu Gln Thr Ser Ala Ile Leu Asp Arg Gln
500 505 510

Phe Glu Val Val Ser Ala Val Asn Asp Ile Asn Asp Tyr Gln Gly Pro
515 520 525

Gly Thr Gly Tyr Arg Ile Ser Ala Glu Arg Trp Ala Glu Ile Lys Asn
530 535 540

Ile Pro Gly Val Val Gln Pro Asp Thr Ile Glu
545 550 555

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 194 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:

(A) ORGANISM: DHAB2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Met Gln Gln Thr Thr Gln Ile Gln Pro Ser Phe Thr Leu Lys Thr Arg
1 5 10 15

Glu Gly Gly Val Ala Ser Ala Asp Glu Arg Ala Asp Glu Val Val Ile
20 25 30

Gly Val Gly Pro Ala Phe Asp Lys His Gln His His Thr Leu Ile Asp
35 40 45

Met Pro His Gly Ala Ile Leu Lys Glu Leu Ile Ala Gly Val Glu Glu
50 55 60

Glu Gly Leu His Ala Arg Val Val Arg Ile Leu Arg Thr Ser Asp Val
65 70 75 80

Ser Phe Met Ala Trp Asp Ala Ala Asn Leu Ser Gly Ser Gly Ile Gly
85 90 95

Ile Gly Ile Gln Ser Lys Gly Thr Thr Val Ile His Gln Arg Asp Leu
100 105 110

Leu Pro Leu Ser Asn Leu Glu Leu Phe Ser Gln Ala Pro Leu Leu Thr
115 120 125

Leu Glu Thr Tyr Arg Gln Ile Gly Lys Asn Ala Ala Arg Tyr Ala Arg
130 135 140

Lys Glu Ser Pro Ser Pro Val Pro Val Val Asn Asp Gln Met Val Arg
145 150 155 160

Pro Lys Phe Met Ala Lys Ala Ala Leu Phe His Ile Lys Glu Thr Lys
165 170 175

His Val Val Gln Asp Ala Glu Pro Val Thr Leu His Ile Asp Leu Val
180 185 190

Arg Glu



WO 98/21341 



PCT/US97/20873 






(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 140 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: DHAB3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

Met Ser Glu Lys Thr Met Arg Val Gln Asp Tyr Pro Leu Ala Thr Arg
1 5 10 15

Cys Pro Glu His Ile Leu Thr Pro Thr Gly Lys Pro Leu Thr Asp Ile
20 25 30

Thr Leu Glu Lys Val Leu Ser Gly Glu Val Gly Pro Gln Asp Val Arg
35 40 45

Ile Ser Arg Gln Thr Leu Glu Tyr Gln Ala Gln Ile Ala Glu Gln Met
50 55 60

Gln His Ala Val Ala Arg Asn Phe Arg Arg Ala Ala Glu Leu Ile Ala
65 70 75 80

Ile Pro Asp Glu Arg Ile Leu Ala Ile Tyr Asn Ala Leu Arg Pro Phe
85 90 95

Arg Ser Ser Gln Ala Glu Leu Leu Ala Ile Ala Asp Glu Leu Glu His
100 105 110

Thr Trp His Ala Thr Val Asn Ala Ala Phe Val Arg Glu Ser Ala Glu
115 120 125

Val Tyr Gln Gln Arg His Lys Leu Arg Lys Gly Ser
130 135 140



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 387 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown



(ii) MOLECULE TYPE: protein 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: DHAT

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Met Ser Tyr Arg Met Phe Asp Tyr Leu Val Pro Asn Val Asn Phe Phe
1 5 10 15

Gly Pro Asn Ala Ile Ser Val Val Gly Glu Arg Cys Gln Leu Leu Gly
20 25 30

Gly Lys Lys Ala Leu Leu Val Thr Asp Lys Gly Leu Arg Ala Ile Lys
35 40 45

Asp Gly Ala Val Asp Lys Thr Leu His Tyr Leu Arg Glu Ala Gly Ile
50 55 60

Glu Val Ala Ile Phe Asp Gly Val Glu Pro Asn Pro Lys Asp Thr Asn
65 70 75 80

Val Arg Asp Gly Leu Ala Val Phe Arg Arg Glu Gln Cys Asp Ile Ile
85 90 95

Val Thr Val Gly Gly Gly Ser Pro His Asp Cys Gly Lys Gly Ile Gly
100 105 110

Ile Ala Ala Thr His Glu Gly Asp Leu Tyr Gln Tyr Ala Gly Ile Glu
115 120 125

Thr Leu Thr Asn Pro Leu Pro Pro Ile Val Ala Val Asn Thr Thr Ala
130 135 140

Gly Thr Ala Ser Glu Val Thr Arg His Cys Val Leu Thr Asn Thr Glu
145 150 155 160

Thr Lys Val Lys Phe Val Ile Val Ser Trp Arg Lys Leu Pro Ser Val
165 170 175

Ser Ile Asn Asp Pro Leu Leu Met Ile Gly Lys Pro Ala Ala Leu Thr
180 185 190

Ala Ala Thr Gly Met Asp Ala Leu Thr His Ala Val Glu Ala Tyr Ile
195 200 205

Ser Lys Asp Ala Asn Pro Val Thr Asp Ala Ala Ala Met Gln Ala Ile
210 215 220

Arg Leu Ile Ala Arg Asn Leu Arg Gln Ala Val Ala Leu Gly Ser Asn
225 230 235 240

Leu Gln Ala Arg Glu Asn Met Ala Tyr Ala Ser Leu Leu Ala Gly Met
245 250 255

Ala Phe Asn Asn Ala Asn Leu Gly Tyr Val His Ala Met Ala His Gln
260 265 270



Leu Gly Gly Leu Tyr Asp Met Pro His Gly Val Ala Asn Ala Val Leu
275 280 285

Leu Pro His Val Ala Arg Tyr Asn Leu Ile Ala Asn Pro Glu Lys Phe
290 295 300

Ala Asp Ile Ala Glu Leu Met Gly Glu Asn Ile Thr Gly Leu Ser Thr
305 310 315 320

Leu Asp Ala Ala Glu Lys Ala Ile Ala Ala Ile Thr Arg Leu Ser Met
325 330 335

Asp Ile Gly Ile Pro Gln His Leu Arg Asp Leu Gly Val Lys Glu Ala
340 345 350

Asp Phe Pro Tyr Met Ala Glu Met Ala Leu Lys Asp Gly Asn Ala Phe
355 360 365

Ser Asn Pro Arg Lys Gly Asn Glu Gln Glu Ile Ala Ala Ile Phe Arg
370 375 380

Gln Ala Phe
385

(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

GCGAATTCAT GAGCTATCGT ATGTTTG 27

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

GCGAATTCAG AATGCCTGGC GGAAAATC 28

(2) INFORMATION FOR SEQ ID NO: 40: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

GGGAATTCAT GAGCGAGAAA ACCATGCG 28



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

GC GAATT CTT1" AGCTT C CTTT ATTGCAGC ' 27 

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

GCGAATTCAT GCAACAGACA ACCCAAATTC 30

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

GCGAATTCAC TCCCTTACTA AGTCG 25

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
.EG GAATT CAT . GAAAAGATCA AAACGATTTG ^ , 
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GCGAATTCTT ATTCAATGGT GTCGGGCTG

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

TTGATAATAT AACCATGGCT GCTGCTGCTG ATAG

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

GTATGATATG TTATCTTGGA TCCAATAAAT CTAATCTTC

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

CATGACTAGT AAGGAGGACA ATTC 

(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

CATGGAATTG TCCTCCTTAC TAGT

WHAT IS CLAIMED IS:

1. An improved method for the production of 1,3-propanediol from an organism capable of
producing 1,3-propanediol, said organism comprising at least one gene encoding a
dehydratase activity, the method comprising the steps of:

(a) introducing a gene encoding protein X into the organism to create a transformed
organism; and

(b) culturing the transformed organism in the presence of at least one carbon source
capable of being converted to 1,3-propanediol in said transformed host organism
and under conditions suitable for the production of 1,3-propanediol wherein the
carbon source is selected from the group consisting of monosaccharides,
oligosaccharides, polysaccharides, and a one carbon substrate.

2. The method of Claim 1 further comprising the step of introducing at least one gene encoding
a protein selected from the group consisting of protein 1, protein 2 and protein 3 into the
organism.

3. The method of Claim 1 further comprising the step of recovering the 1,3-propanediol.

4. The method of Claim 1 wherein the gene encoding protein X is isolated from a glycerol 
dehydratase gene cluster. 

5. The method of Claim 1 wherein the gene encoding protein X is isolated from a diol 
dehydratase gene cluster. 

6. The method of Claim 4 wherein the glycerol dehydratase gene cluster is from an organism 
selected from the genera consisting of Klebsiella and Citrobacter.

7. The method of Claim 5 wherein the diol dehydratase gene cluster is from an organism 
selected from the genera consisting of Klebsiella, Clostridium and Salmonella.

8. The method of Claim 1 wherein the gene encoding a dehydratase activity is heterologous to 
the organism. 

9. The method of Claim 1 wherein the gene encoding a dehydratase activity is homologous to
the organism.

10. The method of Claim 1 wherein the organism is selected from the group of genera
consisting of Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus,
Aspergillus, Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia,
Kluyveromyces, Candida, Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter,
Escherichia, Salmonella, Bacillus, Streptomyces and Pseudomonas.

11. The method of Claim 10 wherein the organism is selected from the group consisting of
E. coli and Klebsiella spp.

12. The method of Claim 1 wherein the gene encoding protein X is stably maintained in the host 
genome. 

13. The method of Claim 2 wherein at least one gene encoding a protein selected from protein
1, protein 2 and protein 3 is stably maintained in the host genome.

14. The method olCIaim 1.^ 

15. The method of Claim 1 wherein the gene encoding protein X has the sequence as shown in 
SEQ ID NO: 59. 

16. The method of Claim 2 wherein protein 1 has the sequence as shown in SEQ ID NO: 60 or 
SEQ ID NO: 61. 

17. The method of Claim 2 wherein protein 2 has the sequence as shown in SEQ ID NO: 62 or
SEQ ID NO: 63. 

18. The method of Claim 2 wherein protein 3 has the sequence as shown in SEQ ID NO:64 or 
SEQ ID NO: 65. 

19. A recombinant microorganism capable of producing 1,3-propanediol from a carbon source,
said recombinant microorganism comprising a) at least one gene encoding a dehydratase
activity; b) at least one gene encoding a glycerol-3-phosphatase; and c) at least one gene
encoding protein X.

20. The recombinant microorganism of Claim 19 further comprising d) at least one gene
encoding a protein selected from the group consisting of protein 1, protein 2 and protein 3.

21. The recombinant microorganism of Claim 19 selected from the group consisting of
Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus,
Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida,
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella,
Bacillus, Streptomyces and Pseudomonas.

22. The recombinant microorganism of Claim 19 wherein the gene encoding protein X is 
isolated from a glycerol dehydratase gene cluster. 

23. The recombinant microorganism of Claim 19 wherein the gene encoding protein X is 
isolated from a diol dehydratase gene cluster. 

24. The recombinant microorganism of Claim 22 wherein the glycerol dehydratase gene cluster
is from an organism selected from the genera consisting of Klebsiella and Citrobacter.

25. The recombinant microorganism of Claim 23 wherein the diol dehydratase gene cluster is
from an organism selected from the genera consisting of Klebsiella, Clostridium and
Salmonella.

26. The recombinant microorganism of Claim 19 wherein said dehydratase activity is
heterologous to said microorganism.

27. The recombinant microorganism of Claim 19 wherein said dehydratase activity is
homologous to said microorganism.

28. The recombinant microorganism of Claim 19 wherein the gene encoding protein X has the 
sequence as shown in SEQ ID NO: 59. 

29. The recombinant microorganism of Claim 20 wherein protein 1 has the sequence as shown 
in SEQ ID NO: 60 or SEQ ID NO: 61.

30. The method of Claim 20 wherein protein 2 has the sequence as shown in SEQ ID NO: 62 or
SEQ ID NO: 63.

31. The method of Claim 20 wherein protein 3 has the sequence as shown in SEQ ID NO: 64 or
SEQ ID NO: 65.

32. A method for extending the half-life of dehydratase activity in a microorganism capable of
producing 1,3-propanediol and containing at least one gene encoding a dehydratase activity,
comprising the step of introducing a gene encoding protein X into said microorganism and
culturing under conditions suitable for production of 1,3-propanediol.

33. The method of Claim 32 wherein the gene encoding the dehydratase activity is 
heterologous to said microorganism. 

34. The method of Claim 32 wherein the gene encoding the dehydratase activity is homologous 
to said microorganism.

35. The microorganism of Claim 22 wherein the gene encoding protein X is isolated from a
glycerol dehydratase gene cluster. 

36. The microorganism of Claim 32 wherein the gene encoding protein X is isolated from a diol 
dehydratase gene cluster. 

37. The microorganism of Claim 35 wherein the glycerol dehydratase gene cluster is from an 
organism selected from the genera consisting of Klebsiella and Citrobacter.

38. The microorganism of Claim 34 wherein the diol dehydratase gene cluster is from an
organism selected from the genera consisting of Klebsiella, Clostridium and Salmonella. 

39. The method of Claim 32 wherein the microorganism is selected from the group consisting 
of Citrobacter, Enterobacter, Clostridium, Klebsiella, Aerobacter, Lactobacillus, Aspergillus, 
Saccharomyces, Schizosaccharomyces, Zygosaccharomyces, Pichia, Kluyveromyces, Candida,
Hansenula, Debaryomyces, Mucor, Torulopsis, Methylobacter, Escherichia, Salmonella,
Bacillus, Streptomyces and Pseudomonas.

40. The method of Claim 32 further comprising the step of introducing a gene encoding at least
one of protein 1, protein 2 and protein 3 into said microorganism.